PREFACE TO
THE RUSSIAN EDITION 


Mathematics, which originated in antiquity in the needs of daily life, 
has developed into an immense system of widely varied disciplines. Like
the other sciences, it reflects the laws of the material world around us 
and serves as a powerful instrument for our knowledge and mastery of 
nature. But the high level of abstraction peculiar to mathematics means 
that its newer branches are relatively inaccessible to nonspecialists. This 
abstract character of mathematics gave birth even in antiquity to 
idealistic notions about its independence of the material world. 

In preparing the present volume, the authors have kept in mind the
goal of acquainting a sufficiently wide circle of the Soviet intelligentsia
with the various mathematical disciplines, their content and methods,
the foundations on which they are based, and the paths along which
they have developed.

As a minimum of necessary mathematical knowledge on the part of
the reader, we have assumed only secondary-school mathematics, but
the volumes differ from one another with respect to the accessibility of
the material contained in them. Readers wishing to acquaint themselves
for the first time with the elements of higher mathematics may profitably
read the first few chapters, but for a complete understanding of the
subsequent parts it will be necessary to have made some study of
corresponding textbooks. The book as a whole will be understood in a
fundamental way only by readers who already have some acquaintance
with the applications of mathematical analysis; that is to say, with the
differential and integral calculus. For such readers, namely teachers of
mathematics and instructors in engineering and the natural sciences, it
will be particularly important to read those chapters which introduce
the newer branches of mathematics.

Naturally it has not been possible, within the limits of one book, to
exhaust all the riches of even the most fundamental results of mathematical
research; a certain freedom in the choice of material has been inevitable
here. But along general lines, the present book will give an idea of the
present state of mathematics, its origins, and its probable future
development. For this reason the book is also intended to some extent for persons
already acquainted with most of the factual material in it. It may perhaps
help to remove a certain narrowness of outlook occasionally to be
found in some of our younger mathematicians.

The separate chapters of the book are written by various authors, 
whose names are given in the Contents. But as a whole the book is the 
result of collaboration. Its general plan, the choice of material, the
successive versions of individual chapters, were all submitted to general
discussion, and improvements were made on the basis of a lively exchange 
of opinions. Mathematicians from several cities in the Soviet Union 
were given an opportunity, in the form of organized discussion, to make 
many valuable remarks concerning the original version of the text. Their 
opinions and suggestions were taken into account by the authors. 

The authors of some of the chapters also took a direct share in
preparing the final version of other chapters: The introductory part of
Chapter II was written essentially by B. N. Delone, while D. K. Faddeev
played an active role in the preparation of Chapter IV and Chapter XX.

A share in the work was also taken by several persons other than the 
authors of the individual chapters: §4 of Chapter XIV was written by
L. V. Kantorovič, §6 of Chapter VI by O. A. Ladyženskaja, §5 of
Chapter X by A. G. Postnikov; work was done on the text of Chapter V
by O. A. Oleĭnik and on Chapter XI by Ju. V. Prohorov.

Certain sections of Chapters I, II, VII, and XVII were written by
V. A. Zalgaller. The editing of the final text was done by V. A. Zalgaller
and V. S. Videnskiĭ with the cooperation of T. V. Rogozkinaja and
A. P. Leonovaja.

The greater part of the illustrations were prepared by E. P. Sen'kin.

FOREWORD BY THE
EDITOR OF THE TRANSLATION 


Mathematics, in view of its abstractness, offers greater difficulty to the 
expositor than any other science. Yet its rapidly increasing role in modern
life creates both a need and a desire for good exposition.

In recent years many popular books about mathematics have appeared
in the English language, and some of them have enjoyed an immense
sale. But for the most part they have contained little serious mathematical
instruction, and many of them have neglected the twentieth century, the
undisputed “golden age” of mathematics. Although they are admirable
in many other ways, they have not yet undertaken the ultimate task of
mathematical exposition, namely the large-scale organization of modern
mathematics in such a way that the reader is constantly delighted by the
obvious economizing of his own time and effort. Anyone who reads
through some of the chapters in the present book will realize how well
this task has been carried out by the Soviet authors, in the systematic
collaboration they have described in their preface.

Such a book, written for “a wide circle of the intelligentsia,” must also 
discuss the general cultural importance of mathematics and its continuous 
development from the earliest beginnings of history down to the present
day. To form an opinion of the book from this point of view the reader
need only glance through the first chapter in Part 1 and the introduction
to certain other chapters; for example, Analysis, or Analytic Geometry.

In translating the passages on the history and cultural significance of 
mathematical ideas, the translators have naturally been aware of even
greater difficulties than are usually associated with the translation of
scientific texts. As organizer of the group, I express my profound
gratitude to the other two translators, Tamas Bartha and Kurt Hirsch, for
their skillful cooperation.

FOREWORD


The present translation, which was originally published by the American
Mathematical Society, will now enjoy a more general distribution in
its new format. In thus making the book more widely available the
Society has been influenced by various expressions of opinion from
American mathematicians. For example, “. . . the book will contribute
materially to a better understanding by the public of what mathematicians
are up to. . . . It will be useful to many mathematicians, physicists and
chemists, as well as to laymen. . . . Whether a physicist wishes to know
what a Lie algebra is and how it is related to a Lie group, or an
undergraduate would like to begin the study of homology, or a crystallographer
is interested in Fedorov groups, or an engineer in probability, or any 
scientist in computing machines, he will find here a connected, lucid 
account.” 

In its first edition this translation has been widely read by
mathematicians and students of mathematics. We now look forward to its wider
usefulness in the general English-speaking world. 

August, 1964 

S. H. Gould 
Editor of Translations
American Mathematical Society
Providence, Rhode Island

PART 1



CHAPTER I

A GENERAL VIEW
OF MATHEMATICS


An adequate presentation of any science cannot consist of detailed
information alone, however extensive. It must also provide a proper
view of the essential nature of the science as a whole. The purpose of the
present chapter is to give a general picture of the essential nature of
mathematics. For this purpose there is no great need to introduce any of
the details of recent mathematical theories, since elementary mathematics
and the history of the science already provide a sufficient foundation for 
general conclusions. 

§1. The Characteristic Features of Mathematics 

1. Abstractions, proofs, applications. With even a superficial
knowledge of mathematics, it is easy to recognize certain characteristic features:
its abstractness, its precision, its logical rigor, the indisputable character
of its conclusions, and finally, the exceptionally broad range of its
applications.

The abstractness of mathematics is easy to see. We operate with abstract 
numbers without worrying about how to relate them in each case to 
concrete objects. In school we study the abstract multiplication table,
that is, a table for multiplying one abstract number by another, not a 
number of boys by a number of apples, or a number of apples by the 
price of an apple. 

Similarly in geometry we consider, for example, straight lines and not
stretched threads, the concept of a geometric line being obtained by
abstraction from all other properties, excepting only extension in one
direction. More generally, the concept of a geometric figure is the result
of abstraction from all the properties of actual objects except their spatial
form and dimensions. 

Abstractions of this sort are characteristic for the whole of mathematics. 
The concept of a whole number and of a geometric figure are only two of
the earliest and most elementary of its concepts. They have been followed
by a mass of others, too numerous to describe, extending to such
abstractions as complex numbers, functions, integrals, differentials, functionals,
n-dimensional, and even infinite-dimensional spaces, and so forth. These
abstractions, piled up as it were on one another, have reached such a
degree of generalization that they apparently lose all connection with
daily life and the “ordinary mortal” understands nothing about them
beyond the mere fact that “all this is incomprehensible.”

In reality, of course, the case is not so at all. Although the concept of
n-dimensional space is no doubt extremely abstract, yet it does have a
completely real content, which is not very difficult to understand. In the
present book it will be our task to emphasize and clarify the concrete
content of such abstract concepts as those mentioned earlier, so that the
reader may convince himself that they are all connected with actual life,
both in their origin and in their applications. 

But abstraction is not the exclusive property of mathematics; it is 
characteristic of every science, even of all mental activity in general.
Consequently, the abstractness of mathematical concepts does not in itself
give a complete description of the peculiar character of mathematics.

The abstractions of mathematics are distinguished by three features. 
In the first place, they deal above all else with quantitative relations and
spatial forms, abstracting them from all other properties of objects. Second,
they occur in a sequence of increasing degrees of abstraction, going very
much further in this direction than the abstractions of other sciences. We
will illustrate these two features in detail later, using as examples the
fundamental notions of number and figure. Finally, and this is obvious,
mathematics as such moves almost wholly in the field of abstract concepts 
and their interrelations. While the natural scientist turns constantly to 
experiment for proof of his assertions, the mathematician employs only 
argument and computation. 

It is true that mathematicians also make constant use, to assist them in 
the discovery of their theorems and methods, of models and physical 
analogues, and they have recourse to various completely concrete
examples. These examples serve as the actual source of the theory and as 
a means of discovering its theorems, but no theorem definitely belongs 
to mathematics until it has been rigorously proved by a logical argument. 
If a geometer, reporting a newly discovered theorem, were to demonstrate
it by means of models and to confine himself to such a demonstration, no
mathematician would admit that the theorem had been proved. The
demand for a proof of a theorem is well known in high school geometry,
but it pervades the whole of mathematics. We could measure the angles
at the base of a thousand isosceles triangles with extreme accuracy, but
such a procedure would never provide us with a mathematical proof of the
theorem that the angles at the base of an isosceles triangle are equal.
Mathematics demands that this result be deduced from the fundamental
concepts of geometry, which at the present time, in view of the fact that
geometry is nowadays developed on a rigorous basis, are precisely
formulated in the axioms. And so it is in every case. To prove a theorem
means for the mathematician to deduce it by a logical argument from the
fundamental properties of the concepts occurring in that theorem. In this
way, not only the concepts but also the methods of mathematics are 
abstract and theoretical. 

The results of mathematics are distinguished by a high degree of logical 
rigor, and a mathematical argument is conducted with such scrupulousness 
as to make it incontestable and completely convincing to anyone who 
understands it. The scrupulousness and cogency of mathematical proofs 
are already well known in a high school course. Mathematical truths 
are in fact the prototype of the completely incontestable. Not for nothing 
do people say “as clear as two and two are four.” Here the relation “two 
and two are four” is introduced as the very image of the irrefutable
and incontestable. 

But the rigor of mathematics is not absolute; it is in a process of
continual development; the principles of mathematics have not congealed
once and for all but have a life of their own and may even be the subject
of scientific quarrels. 

In the final analysis the vitality of mathematics arises from the fact that 
its concepts and results, for all their abstractness, originate, as we shall see,
in the actual world and find widely varied application in the other sciences,
in engineering, and in all the practical affairs of daily life; to realize this is
the most important prerequisite for understanding mathematics. 

The exceptional breadth of its applications is another characteristic 
feature of mathematics. 

In the first place we make constant use, almost every hour, in industry 
and in private and social life, of the most varied concepts and results of 
mathematics, without thinking about them at all; for example, we use
arithmetic to compute our expenses or geometry to calculate the floor area
of an apartment. Of course, the rules here are very simple, but we should
remember that in some period of antiquity they represented the most
advanced mathematical achievements of the age.

Second, modern technology would be impossible without mathematics.
There is probably not a single technical process which can be carried
through without more or less complicated calculations; and mathematics
plays a very important role in the development of new branches of
technology. 

Finally, it is true that every science, to a greater or lesser degree, makes 
essential use of mathematics. The “exact sciences,” mechanics, astronomy, 
physics, and to a great extent chemistry, express their laws, as every 
schoolboy knows, by means of formulas and make extensive use of
mathematical apparatus in developing their theories. The progress of these
sciences would have been completely impossible without mathematics.
For this reason the requirements of mechanics, astronomy, and physics
have always exercised a direct and decisive influence on the development
of mathematics. 

In other sciences mathematics plays a smaller role, but here too it finds
important applications. Of course, in the study of such complicated
phenomena as occur in biology and sociology, the mathematical method
cannot play the same role as, let us say, in physics. In all cases, but
especially where the phenomena are most complicated, we must bear in mind,
if we are not to lose our way in meaningless play with formulas, that the
application of mathematics is significant only if the concrete phenomena
have already been made the subject of a profound theory. In one way or
another, mathematics is applied in almost every science, from mechanics 
to political economy. 

Let us recall some particularly brilliant applications of mathematics 
in the exact sciences and in technology. 

The planet Neptune, one of the most distant in the Solar System, was 
discovered in the year 1846 on the basis of mathematical calculations. 
By analyzing certain irregularities in the motion of Uranus, the astron- 
omers Adams and Leverrier came to the conclusion that these irregularities 
were caused by the gravitational attraction of another planet. Leverrier 
calculated on the basis of the laws of mechanics exactly where this planet 
must be, and an observer to whom he communicated his results caught 
sight of it in his telescope in the exact position indicated by Leverrier. 
This discovery was a triumph not only for mechanics and astronomy, 
and in particular for the system of Copernicus, but also for the powers
of mathematical calculation. 

Another example, no less impressive, was the discovery of electro- 
magnetic waves. The English physicist Maxwell, by generalizing the laws 
of electromagnetic phenomena as established by experiment, was able to 
express these laws in the form of equations. From these equations he
deduced, by purely mathematical methods, that electromagnetic waves
could exist and that they must be propagated with the speed of light.
On the basis of this result, he proposed the electromagnetic theory
of light, which was later developed and deepened in every direction. 
Moreover, Maxwell’s results led to the search for electromagnetic waves 
of purely electrical origin, arising for example from an oscillating charge. 
These waves were actually discovered by Hertz. Shortly afterwards, 
A. S. Popov, by discovering means for exciting, transmitting, and receiving 
electromagnetic oscillations made them available for a wide range of 
applications and thereby laid the foundations for the whole technology 
of radio. In the discovery of radio, now the common possession of 
everyone, an important role was played by the results of a purely
mathematical deduction.

So from observation, as for example of the deflection of a magnetic 
needle by an electric current, science proceeds to generalization, to a theory 
of the phenomena, and to formulation of laws and to mathematical 
expression of them. From these laws come new deductions, and finally,
the theory is embodied in practice, which in its turn provides powerful 
new impulses for the development of the theory. 

It is particularly remarkable that even the most abstract constructions 
of mathematics, arising within that science itself, without any immediate
motivation from the natural sciences or from technology, nevertheless
have fruitful applications. For example, imaginary numbers first came to
light in algebra, and for a long time their significance in the actual world
remained uncomprehended, a circumstance indicated by their very name.
But when about 1800 a geometrical interpretation (see Chapter IV, §2)
was given to them, imaginary numbers became firmly established in
mathematics, giving rise to the extensive theory of functions of a complex
variable, i.e., of a variable of the form x + y√−1. This theory of
“imaginary” functions of an “imaginary” variable proved itself to be far 
from imaginary, but rather a very practical means of solving technological 
problems. Thus, the fundamental results of N. E. Jukovski concerning 
the lift on the wing of an airplane are proved by means of this theory. 
The same theory is useful, for example, in the solution of problems 
concerning the oozing of water under a dam, problems whose importance 
is obvious during the present period of construction of huge hydroelectric
stations. 

Another example, equally impressive, is provided by non-Euclidean
geometry,* which arose from the efforts, extending for 2000 years from
the time of Euclid, to prove the parallel axiom, a problem of purely
mathematical interest. N. I. Lobačevskiĭ himself, the founder of the
new geometry, was careful to label his geometry “imaginary,” since he
could not see any meaning for it in the actual world, although he was
confident that such a meaning would eventually be found. The results of
his geometry appeared to the majority of mathematicians to be not only
“imaginary” but even unimaginable and absurd. Nevertheless, his ideas
laid the foundation for a new development of geometry, namely the
creation of theories of various non-Euclidean spaces; and these ideas
subsequently became the basis of the general theory of relativity, in which
the mathematical apparatus consists of a form of non-Euclidean geometry
of four-dimensional space. Thus the abstract constructions of mathematics,
which at the very least seemed incomprehensible, proved themselves a
powerful instrument for the development of one of the most important
theories of physics. Similarly, in the present-day theory of atomic
phenomena, in the so-called quantum mechanics, essential use is made of many
extremely abstract mathematical concepts and theories, as for example
the concept of infinite-dimensional space.

* Here we merely point out this example without further explanation, for which the
reader may turn to Chapter XVII.

There is no need to give any further examples, since we have already
shown with sufficient emphasis that mathematics finds widespread
application in everyday life and in technology and science; in the exact sciences
and in the great problems of technology, applications are found even for
those theories which arise within mathematics itself. This is one of the
characteristic peculiarities of mathematics, along with its abstractness 
and the rigor and conclusiveness of its results. 

2. The essential nature of mathematics. In discussing these special
features of mathematics we have been far from explaining its essence;
rather we have merely pointed out its external marks. Our task now is to
explain the essential nature of these characteristic features. For this 
purpose it will be necessary to answer, at the very least, the following 
questions: 

What do these abstract mathematical concepts reflect? In other words,
what is the actual subject matter of mathematics?

Why do the abstract results of mathematics appear so convincing, and
its initial concepts so obvious? In other words, on what foundation do
the methods of mathematics rest?

Why, in spite of all its abstractness, does mathematics find such wide
application and does not turn out to be merely idle play with abstractions?
In other words, how is the significance of mathematics to be explained?

Finally, what forces lead to the further development of mathematics,
allowing it to unite abstractness with breadth of application? What is the
basis for its continuing growth?

In answering these questions we will form a general picture of the content
of mathematics, of its methods, and of its significance and its development; 
that is, we will understand its essence. 

Idealists and metaphysicists not only fall into confusion in their attempts 
to answer these basic questions but they go so far as to distort
mathematics completely, turning it literally inside out. Thus, seeing the extreme
abstractness and cogency of mathematical results, the idealist imagines 
that mathematics issues from pure thought. 

In reality, mathematics offers not the slightest support for idealism or 
metaphysics. We will convince ourselves of this as we attempt, in general 
outline, to answer the listed questions about the essence of mathematics. 
For a preliminary clarification of these questions, it is sufficient to examine 
the foundations of arithmetic and elementary geometry, to which we now 
turn. 

§2. Arithmetic 

1. The concept of a whole number. The concept of number (for the 
time being, we speak only of whole positive numbers), though it is so 
familiar to us today, was worked out very slowly. This can be seen from 
the way in which counting has been done by various races who until 
recent times have remained at a relatively primitive level of social life.
Among some of them, there were no names for numbers higher than two 
or three; among others, counting went further but ended after a few 
numbers, after which they simply said “many” or “countless.” A stock of 
clearly distinguished names for numbers was only gradually accumulated 
among the various peoples. 

At first these peoples had no concept of what a number is, although they 
could in their own fashion make judgments about the size of one or another 
collection of objects met with in their daily life. We must conclude that a 
number was directly perceived by them as an inseparable property of a
collection of objects, a property which they did not, however, clearly
distinguish. We are so accustomed to counting that we can hardly imagine
this state of affairs, but it is possible to understand it.*

* In fact, every collection of objects, whether it be a flock of sheep or a pile of
firewood, exists and is immediately perceived in all its concreteness and complexity.
The distinguishing in it of separate properties and relationships is the result of conscious
analysis. Primitive thought does not yet make this analysis, but considers the object
only as a whole. Similarly, a man who has not studied music perceives a musical
composition without distinguishing in it the details of melody, tonality, and so forth,
while at the same time a musician easily analyzes even a complicated symphony.

At the next higher level a number already appears as a property of a
collection of objects, but it is not yet distinguished from the collection
as an “abstract number,” as a number in general, not connected with
concrete objects. This is obvious from the names of numbers among
certain peoples, as “hand” for five and “whole man” for twenty. Here five
is to be understood not abstractly but simply in the sense of “as many as
the fingers on a hand,” twenty is “as many as the fingers and toes on a
man” and so forth. In a completely analogous way, certain peoples had
no concept of “black,” “hard,” or “circular.” In order to say that an
object is black, they compared it with a crow for example, and to say that
there were five objects, they directly compared these objects with a hand.
In this way it also came about that various names for numbers were used
for various kinds of objects; some numbers for counting people, others
for counting boats, and so forth, up to as many as ten different kinds of
numbers. Here we do not have abstract numbers, but merely a sort of
“appellation,” referring only to a definite kind of objects. Among other
peoples there were in general no separate names for numbers, as for
example, no word for “three,” although they could say “three men” or
“in three places,” and so forth.

Similarly among ourselves, we quite readily say that this or that object 
is black but much more rarely speak about “blackness” in itself, which is 
a more abstract concept.* 

The number of objects in a given collection is a property of the
collection, but the number itself, as such, the “abstract number,” is a property
abstracted from the concrete collection and considered simply in itself,
like “blackness” or “hardness.” Just as blackness is the property common
to all objects of the color of coal, so the number “five” is the common
property of all collections containing as many objects as there are fingers
on a hand. In this case the equality of the two numbers is established by
simple comparison: We take an object from the collection, bend one finger
over, and count in this way up to the end of the collection. More generally,
by pairing off the objects of two collections, it is possible, without making
any use of numbers at all, to establish whether or not the collections
contain the same number of objects. For example, if guests are taking their 
places at the table they can easily, without any counting, make it clear to 
the hostess that she has forgotten one setting, since one guest will be 
without a setting. 
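
The pairing procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name compare_by_pairing and the guest and setting names are hypothetical and do not come from the text. The sketch removes one object from each collection at a time and reports which collection runs out first, without ever computing a number.

```python
def compare_by_pairing(first, second):
    """Pair off items one at a time; neither collection is ever counted."""
    left, right = iter(first), iter(second)
    exhausted = object()  # private marker meaning "nothing left to pair"
    while True:
        a = next(left, exhausted)
        b = next(right, exhausted)
        if a is exhausted and b is exhausted:
            return "same"             # every object found a partner
        if a is exhausted:
            return "second has more"  # an object of `second` is left over
        if b is exhausted:
            return "first has more"   # an object of `first` is left over


# Hypothetical table from the example: four guests, three settings.
guests = ["Anna", "Boris", "Vera", "Gleb"]
settings = ["setting A", "setting B", "setting C"]
print(compare_by_pairing(guests, settings))
```

Here one guest remains without a partner, so the comparison is decided without any use of numbers, which is the point the paragraph makes.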

* In the formation of concepts about properties of objects, such as color or the 
numerosity of a collection, it is possible to distinguish three steps, which we must not, 
of course, try to separate too sharply from one another. At the first step the property 
is defined by direct comparison of objects; like a crow, as many as on a hand. At the 
second, an adjective appears: a black stone or (the numerical adjective being quite 
analogous) five trees. At the third step the property is abstracted from the objects 
and may appear “as such”; for example “blackness,” or the abstract number “five.”

In this way it is possible to give the following definition of a number:
Each separate number like “two,” “five,” and so forth, is that property
of collections of objects which is common to all collections whose
objects can be put into one-to-one correspondence with one another
and which is different for those collections for which such a
correspondence is impossible. In order to discover this property and
to distinguish it clearly, that is, in order to form the concept of
a definite number and to give it a name “six,” “ten,” and so forth,
it was necessary to compare many collections of objects. For
countless generations people repeated the same operation millions of
times and in that way discovered numbers and the relations among 
them. 

2. Relations among the whole numbers. Operations with numbers 
arose in their turn as a reflection of relations among concrete objects.
This is observable even in the names of numbers. For example, among
certain American Indians the number “twenty-six” is pronounced as
“above two tens I place a six,” which is clearly a reflection of a concrete
method of counting objects. Addition of numbers corresponds to placing
together or uniting two or more collections, and it is equally easy to see
the concrete meaning of subtraction, multiplication, and division.
Multiplication in particular arose to a great extent, it seems clear, from
the habit of counting off equal collections: that is, by twos, by threes, and 
so forth. 

In the process of counting, men not only discovered and assimilated the 
relations among the separate numbers, as for example that two and three 
are five, but also they gradually established certain general laws. By
practical experience, it was discovered that a sum does not depend on the order
of the summands and that the result of counting a given set of objects
does not depend on the order in which the counting takes place, a fact
which is reflected in the essential identity of the “ordinal” and “cardinal” 
numbers: first, second, third, and one, two, three. In this way the numbers 
appeared not as separate and independent, but as interrelated with one 
another. 
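
The two general laws mentioned above can be checked by brute force on small cases. The following Python sketch is an illustration only and not a proof: the names count_by_tallying, flock, and summands are hypothetical, chosen for this example. It tallies a small collection in every possible order, and adds three numbers in every possible order, and records the distinct outcomes.

```python
from itertools import permutations


def count_by_tallying(objects):
    """Count by making one tally mark per object, in the order given."""
    tally = 0
    for _ in objects:
        tally += 1
    return tally


# Hypothetical small collection and summands for the experiment.
flock = ["ewe", "ram", "lamb"]
summands = [2, 3, 7]

# Every order of counting, and every order of summation.
counts = {count_by_tallying(order) for order in permutations(flock)}
sums = {sum(order) for order in permutations(summands)}

print(counts)  # a single value: the count does not depend on the order
print(sums)    # a single value: the sum does not depend on the order
```

Such an experiment mirrors the practical experience the text describes; the general law itself rests on that experience repeated over countless collections, not on any finite check.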

Some numbers are expressed in terms of others in their very names and 
in the way they are written. Thus, “twenty” denotes “two (times) ten”;
in French, eighty is “four-twenties” (quatre-vingt), ninety is
“four-twenties-ten”; and the Roman numerals VIII, IX denote that 8 = 5 + 3,
9 = 10 - 1. 
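
The way a Roman numeral expresses one number through others can be made explicit with a short Python sketch. It is an illustration only, not part of the original text: the function name `roman_terms` is hypothetical, and the sketch assumes well-formed numerals in the usual additive and subtractive notation.

```python
# Values of the individual Roman symbols.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_terms(numeral: str) -> list[int]:
    """Return the signed terms whose sum is the value of the numeral.

    A symbol standing before a larger one is subtracted (IX = 10 - 1);
    otherwise it is added (VIII = 5 + 1 + 1 + 1).
    """
    terms = []
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # Value of the following symbol, or 0 at the end of the numeral.
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        terms.append(-value if value < next_value else value)
    return terms


print(roman_terms("VIII"), sum(roman_terms("VIII")))  # [5, 1, 1, 1] 8
print(roman_terms("IX"), sum(roman_terms("IX")))      # [-1, 10] 9
```

The list of terms is exactly the relation that the written numeral records: VIII is built up from five, IX is measured back from ten.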

In general, there arose not just the separate numbers but a system of 
numbers with mutual relations and rules. 

The subject matter of arithmetic is exactly this, the system of numbers
with its mutual relations and rules.* The separate abstract number by itself
does not have tangible properties, and in general there is very little to be
said about it. If we ask ourselves, for example, about the properties of
the number six, we note that 6 = 5 + 1, 6 = 3 · 2, that 6 is a factor of 30,
and so forth. But here the number 6 is always connected with other numbers;
in fact, the properties of a given number consist precisely of its
relations with other numbers.† Consequently, it is clear that every arithmetical
operation determines a connection or relation among numbers.
Thus the subject matter of arithmetic is relations among numbers. But 
these relations are the abstract images of actual quantitative relations 
among collections of objects; so we may say that arithmetic is the science 
of actual quantitative relations considered abstractly, that is, purely as 
relations. Arithmetic, as we see, did not arise from pure thought, as the 
idealists represent, but is the reflection of definite properties of real things; 
it arose from the long practical experience of many generations.

3. Symbols for the numbers. As social life became more extensive 
and complicated, it posed broader problems. Not only was it necessary 
to take note of the number of objects in a set and to tell others about it, 
a necessity which had already led to formulation of the concept of number 
and to names for the numbers, but it became essential to learn to count 
increasingly larger collections, of animals in a herd, of objects for exchange,
of days before a fixed date, and so forth, and to communicate the results 
of the count to others. This situation absolutely demanded improvement 
in the names and also in the symbols for numbers. 

The introduction of symbols for the numbers, which apparently occurred
as soon as writing began, played a great role in the development of
arithmetic. Moreover, it was the first step toward mathematical signs and
formulas in general. The second step, consisting of the introduction of
signs for arithmetical operations and of a literal designation for the
unknown (x), was taken considerably later. 

* The word “arithmetic,” meaning the “art of calculation,” is derived from the
Greek adjective “arithmetic” formed from the noun “arithmos,” meaning “number.”
The adjective modifies a noun “techne” (art, technique), which is here understood.

† This is understandable from the most general considerations. An arbitrary
abstraction, removed from its concrete basis (just as a number is abstracted from a
concrete collection of objects), has no sense “in itself”; it exists only in its relations
with other concepts. These relations are already implicit in any statement about the
abstraction, in the most incomplete definition of it. Without them the abstraction
lacks content and significance, i.e., it simply does not exist. The content of the concept
of an abstract number lies in the rules, in the mutual relations of the system of numbers.

The concept of number, like every other abstract concept, has no
immediate image; it cannot be exhibited but can only be conceived in the
mind. But thought is formulated in language, so that without a name there
can be no concept. The symbol is also a name, except that it is not oral
but written and presents itself to the mind in the form of a visible image.
For example, if I say “seven,” what do you picture to yourself? Probably
not a set of seven objects of one kind or another, but rather the symbol
“7,” which forms a sort of tangible framework for the abstract number
“seven.” Moreover, a number 18273 is considerably harder to pronounce
than to write and cannot be pictured with any accuracy in the form of
a set of objects. In this way it came about, though only after some lapse of
time, that the symbols gave rise to the conception of numbers so large that
they could never have been discovered by direct observation or by
enumeration. With the appearance of government, it was necessary to
collect taxes, to assemble and outfit an army, and so forth, all of which
required operations with very large numbers. 

Thus the importance of symbols for the numbers consists, in the first 
place, in their providing a simple embodiment of the concept of an abstract 
number.* This is the role of mathematical designations in general: They
provide an embodiment of abstract mathematical concepts. Thus
+ denotes addition, x denotes an unknown number, a an arbitrary given
number, and so forth. In the second place the symbols for numbers provide
a particularly simple means of carrying out operations on them. Everyone
knows how much easier it is “to calculate on paper” than “in one’s head.”
Mathematical signs and formulas have this advantage in general: They
allow us to replace a part of our arguments with calculations, with 
something that is almost mechanical. Moreover, if a calculation is written 
down, it already possesses a definite authenticity; everything is visible, 
everything can be checked, and everything is defined by exact rules. As 
examples one might mention addition by individual columns or any 
algebraic transformation such as “taking over to the other side of the 
equation with change of sign.” From what has been said, it is clear that
without suitable symbols for the numbers arithmetic could not have made
much progress. Even more is it true that contemporary mathematics would
be impossible without its special signs and formulas.

* It is worth remarking that the concept of number, which was worked out with
such difficulty in a long period of time, is mastered nowadays by a child with relative
ease. Why? The first reason is, of course, that the child hears and sees adults constantly
making use of numbers, and they even teach him to do the same. But a second reason,
and this is the one to which we wish to draw special attention, is that the child already
has at hand words and symbols for the numbers. He first learns these external symbols
for number and only later masters the meaning of them.

It is obvious that the extremely convenient method of writing numbers
that is in use today could not have been worked out all at once. From
ancient times there appeared among various peoples, from the very
beginnings of their culture, various symbols for the numbers, which were
very unlike our contemporary ones not only in their general appearance 
but also in the principles on which they were chosen. For example, the 
decimal system was not used everywhere, and among the ancient
Babylonians there was a system that was partly decimal and partly
sexagesimal. Table 1 gives some of the symbols for numbers among
various peoples. In particular, we see that the ancient Greeks, and
later also the Russians, made use of letters to designate numbers. Our
contemporary “Arabic” symbols and, more generally, our method of
forming the numbers, were brought from India to Europe by the Arabs
in the 10th century and became firmly rooted there in the course of the
next few centuries. 

The first peculiarity of our system is that it is a decimal system. But this
is not a matter of great importance, since it would have been quite possible
to use, for example, a duodecimal system by introducing special symbols
for ten and eleven. The most important peculiarity of our system of
designating numbers is that it is “positional”; that is, that one and the
same digit has a different significance depending upon its position.
For example, in 372 the digit 3 denotes the number of hundreds and
7 the number of tens. This method of writing is not only concise and
simple but makes calculations very easy. The Roman numerals were in
this respect much less convenient, the same number 372 being written in
the form CCCLXXII; it is a very laborious task to multiply together two
large numbers written in Roman numerals. 
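
The contrast between positional and Roman writing can be sketched in a few lines of Python. This is an illustrative sketch rather than anything taken from the original text: the helper names `place_values` and `to_roman` are hypothetical, and `to_roman` assumes the usual modern subtractive convention for numbers from 1 to 3999.

```python
def place_values(n: int, base: int = 10) -> list[tuple[int, int]]:
    """Split n into (digit, weight) pairs, most significant first."""
    if n == 0:
        return [(0, 1)]
    pairs = []
    weight = 1
    while n > 0:
        n, digit = divmod(n, base)   # peel off the lowest digit
        pairs.append((digit, weight))
        weight *= base
    return pairs[::-1]


def to_roman(n: int) -> str:
    """Write n (1..3999) in Roman numerals by greedy subtraction."""
    table = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in table:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


print(place_values(372))      # [(3, 100), (7, 10), (2, 1)]
print(place_values(372, 12))  # duodecimal: [(2, 144), (7, 12), (0, 1)]
print(to_roman(372))          # CCCLXXII
```

In the positional form the same three symbols would serve for any base, and only the weights change; the Roman form needs a separate symbol for each order of magnitude and carries no column structure on which to calculate.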

Positional writing of numbers demands that in one way or another we 
take note of any category of numbers that has been omitted, since if we 
do not do this, we will confuse, for example, thirty-one with three-hundred-and-one.
In the position of the omitted category we must place a zero,
thereby distinguishing 301 and 31. In a rudimentary form, zero already
appears in the late Babylonian cuneiform writings, but its systematic
introduction was an achievement of the Indians:* It allowed them to
proceed to a completely positional system of writing just as we have it
today. 

But in this way zero also became a number and entered into the system
of numbers. By itself zero is nothing; in the Sanskrit language of ancient
India, it is called exactly that: “empty” (çūnya); but in connection with
other numbers, zero acquires content, and well-known properties; for
example, an arbitrary number plus zero is the same number, or when an
arbitrary number is multiplied by zero it becomes zero.

* The first Indian manuscript in which zero appears comes from the end of the
9th century; in it the number 270 is written exactly as we would write it today. But
it is probable that zero was introduced in India still earlier, in the 6th century.

4. The theory of numbers as a branch of pure mathematics. Let us
return to the arithmetic of the ancients. The oldest texts that have been
preserved from Babylon and Egypt go back to the second millennium B.C.
These and later texts contain various arithmetical problems with their 
solutions, among them certain ones that today belong to algebra, such as 
the solution of quadratic and even cubic equations or progressions; all
this being presented, of course, in the form of concrete problems and
numerical examples. Among the Babylonians we also find certain tables of
squares, cubes, and reciprocals. It is to be supposed that they were already
beginning to form mathematical interests which were not immediately 
connected with practical problems. 

In any case arithmetic was well developed in ancient Babylon and Egypt. 
However, it was not yet a mathematical theory of numbers but rather a 
collection of solutions for various problems and of rules of calculation. 
It is exactly in this way that arithmetic is taught up to the present time in
our elementary schools and is understood by everyone who is not 
especially interested in mathematics. This is perfectly legitimate, but 
arithmetic in this form is still not a mathematical theory. There are no 
general theorems about numbers. 

The transition to theoretical arithmetic proceeded gradually. 

As was pointed out, the existence of symbols allows us to operate with 
numbers so large that it is impossible to visualize them as collections of 
objects or to arrive at them by the process of counting in succession from 
the number one. Among primitive tribes special numbers were worked
out up to 3, 10, 100 and so forth, but after these came the indefinite
“many.” In contrast to this situation the use of symbols for numbers
enabled the Chinese, the Babylonians, and the Egyptians to proceed to
tens of thousands and even to millions. It was at this stage that the possibility
was noticed of indefinitely extending the series of numbers,
although we do not know how soon this possibility was clearly perceived. 
Even Archimedes (287-212 B.C.) in his remarkable essay “The Sand 
Reckoner” took the trouble to describe a method for naming a number 
greater than the number of grains of sand sufficient to fill up the “sphere 
of the fixed stars.” So the possibility of naming and writing such a number 
still required at his time a detailed explanation. 

By the 3rd century B.C., the Greeks had clearly recognized two important
ideas: first, that the sequence of numbers could be indefinitely extended
and second, that it was not only possible to operate with arbitrarily given
numbers but to discuss numbers in general, to formulate and prove general
theorems about them. This idea represents the generalization of an
immense amount of earlier experience with concrete numbers, from which
arose the rules and methods for general reasoning about numbers. A
transition took place to a higher level of abstraction: from separate given
(though abstract) numbers to number in general, to any possible number.

From the simple process of counting objects one by one, we pass to 
the unbounded process of formation of numbers by adding one to the 
number already formed. The sequence of numbers is regarded as being 
indefinitely continuable, and with it there enters into mathematics the 
notion of infinity. Of course, we cannot in fact, by the process of adding 
one, proceed arbitrarily far along the sequence of numbers: Who could 
reach as far as a million-million, which is more than thirty times the number of
seconds in a thousand years? But that is not the point; the process of 
adding ones, the process of forming arbitrarily large collections of objects
is in principle unlimited, so that the possibility exists of continuing the
sequence of numbers beyond all limits. The fact that in actual practice
counting is limited is not relevant; an abstraction is made from it. It is
with this indefinitely prolonged sequence that general theorems about
numbers have to deal.

General theorems about any property of an arbitrary number already 
contain in implicit form infinitely many assertions about the properties of 
separate numbers and are therefore qualitatively richer than any particular 
assertions that could be verified for specific numbers. It is for this reason
that general theorems must be proved by general arguments proceeding
from the fundamental rule for the formation of the sequence of numbers.
Here we perceive a profound peculiarity of mathematics: Mathematics
takes as its subject not only given quantitative relationships but all possible
quantitative relationships and therefore infinity. 

In the famous “Elements” of Euclid, written in the 3rd century B.C.,
we already find general theorems about whole numbers, in particular, the 
theorem that there exist arbitrarily large prime numbers.* 
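
The classical argument behind this theorem can be traced in a small Python sketch. It is offered as an illustration under stated assumptions and is not drawn from the text itself: the function names are hypothetical, and trial division is used only because it is the simplest way to find a prime factor. The idea is that the product of any finite list of primes, increased by one, leaves remainder 1 on division by each listed prime, so every prime factor of it lies outside the list.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n): n itself is prime


def new_prime(primes: list[int]) -> int:
    """Given any finite list of primes, return a prime not in the list."""
    return smallest_prime_factor(prod(primes) + 1)


primes = [2, 3, 5, 7, 11, 13]
p = new_prime(primes)
print(p, p in primes)  # 59 False
```

Note that the product plus one need not itself be prime (here 30031 = 59 · 509); the argument only guarantees that its prime factors are new, which is enough to show that no finite list of primes is complete.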

Thus arithmetic is transformed into the theory of numbers. It is already 
removed from particular concrete problems to the region of abstract
concepts and arguments. It has become a part of “pure” mathematics.
More precisely, this was the moment of the birth of pure mathematics
itself with the characteristic features discussed in our first section. We must,
of course, take note of the fact that pure mathematics was born simul- 
taneously from arithmetic and geometry and that there were already to 
be found in the general rules of arithmetic some of the rudiments of 
algebra, a subject which was separated from arithmetic at a later stage. 
But we will discuss this later. 

* We recall that a prime number is defined as a positive integer greater than unity
which is divisible without remainder only by the number itself and by unity.

It remains now to summarize our conclusions up to this point, since we
have now traced out, though in very hurried fashion, the process whereby
theoretical arithmetic arose from the concept of number.

5. The essential nature of arithmetic. Since the birth of theoretical 
arithmetic is part of the birth of mathematics, we may reasonably expect 
that our conclusions about arithmetic will throw light on our earlier 
questions concerning mathematics in general. Let us recall these questions, 
particularly in their application to arithmetic. 

1. How did the abstract concepts of arithmetic arise and what do they 
reflect in the actual world?

This question is answered by the earlier remarks about the birth of 
arithmetic. Its concepts correspond to the quantitative relations of 
collections of objects. These concepts arose by way of abstraction, as a 
result of the analysis and generalization of an immense amount of practical
experience. They arose gradually; first came numbers connected with
concrete objects, then abstract numbers, and finally the concept of number
in general, of any possible number. Each of these concepts was made
possible by a combination of practical experience and preceding abstract
concepts. This, by the way, is one of the fundamental laws of formation
of mathematical concepts: They are brought into being by a series of
successive abstractions and generalizations, each resting on a combination
of experience with preceding abstract concepts. The history of the concepts
of arithmetic shows how mistaken is the idealistic view that they arose 
from “pure thought,” from “innate intuition,” from “contemplation of a 
priori forms,” or the like. 

2. Why are the conclusions of arithmetic so convincing and unalterable?

History answers this question too for us. We see that the conclusions
of arithmetic have been worked out slowly and gradually; they reflect
experience accumulated in the course of unimaginably many generations
and have in this way fixed themselves firmly in the mind of man. They
have also fixed themselves in language: in the names for the numbers,
in their symbols, in the constant repetition of the same operations with
numbers, in their constant application to daily life. It is in this way that
they have gained clarity and certainty. The methods of logical reasoning
also have the same source. What is essential here is not only the fact that
they can be repeated at will but their soundness and perspicuity, which 
they possess in common with the relations among things in the actual 
world, relations which are reflected in the concepts of arithmetic and in 
the rules for logical deduction.

This is the reason why the results of arithmetic are so convincing; its
conclusions flow logically from its basic concepts, and both of them, the
methods of logic and the concepts of arithmetic, were worked out and
firmly fixed in our consciousness by three thousand years of practical
experience, on the basis of objective uniformities in the world around us.

3. Why does arithmetic have such wide application in spite of the
abstractness of its concepts? 

The answer is simple. The concepts and conclusions of arithmetic, which 
generalize an enormous amount of experience, reflect in abstract form
those relationships in the actual world that are met with constantly and
everywhere. It is possible to count the objects in a room, the stars, people,
atoms, and so forth. Arithmetic considers certain of their general properties,
in abstraction from everything particular and concrete, and it is precisely
because it considers only these general properties that its conclusions are
applicable to so many cases. The possibility of wide application is guaranteed
by the very abstractness of arithmetic, although it is important here
that this abstraction is not an empty one but is derived from long practical
experience. The same is true for all mathematics, and for any abstract
concept or theory. The possibilities for application of a theory depend
on the breadth of the original material which it generalizes. 

At the same time every abstract concept, in particular the concept of 
number, is limited in its significance as a result of its very abstractness.
In the first place, when applied to any concrete object it reflects only one
aspect of the object and therefore gives only an incomplete picture of it.
How often it happens, for example, that the mere numerical facts say
very little about the essence of the matter. In the second place, abstract
concepts cannot be applied everywhere without certain limiting conditions;
it is impossible to apply arithmetic to concrete problems without first
convincing ourselves that their application makes some sense in the
particular case. If we speak of addition, for example, and merely unite
the objects in thought, then naturally no progress has been made with 
the objects themselves. But if we apply addition to the actual uniting of 
the objects, if we in fact put the objects together, for example by throwing 
them into a pile or setting them on a table, in this case there takes place 
not merely abstract addition but also an actual process. This process does 
not consist merely of the arithmetical addition, and in general it may 
even be impossible to carry it out. For example, the object thrown into a 
pile may break; wild animals, if placed together, may tear one another
apart; the materials put together may enter into a chemical reaction: a
liter of water and a liter of alcohol when poured together produce not 2,
but 1.9 liters of the mixture as a result of partial solution of the liquids;
and so forth. 

If other examples are needed they are easy to produce. 

To put it briefly, truth is concrete; and it is particularly important to
remember this fact with respect to mathematics, exactly because of its
abstractness.

4. Finally, the last question we raised had to do with the forces that led 
to the development of mathematics. 

For arithmetic the answer to this question also is clear from its history. 
We saw how people in the actual world learned to count and to work out 
the concept of number, and how practical life, by posing more difficult 
problems, necessitated symbols for the numbers. In a word, the forces 
that led to the development of arithmetic were the practical needs of social 
life. These practical needs and the abstract thought arising from them 
exercise on each other a constant interaction. The abstract concepts 
provide in themselves a valuable tool for practical life and are constantly 
improved by their very application. Abstraction from all nonessentials
uncovers the kernel of the matter and guarantees success in those cases
where a decisive role is played by the properties and relations picked out
and preserved by the abstraction; namely, in the case of arithmetic, by 
the quantitative relations. 

Moreover, abstract reflection often goes farther than the immediate
demands of a practical problem. Thus the concept of such large numbers 
as a million or a billion arose on the basis of practical calculations but 
arose earlier than the practical need to make use of them. There are many 
such examples in the history of science; it is enough to recall the imaginary 
numbers mentioned earlier. This is just a particular case of a phenomenon 
known to everyone, namely the interaction of experience and abstract
thought, of practice and theory. 

§3. Geometry 

1. The concept of a geometric figure. The history of the origin of
geometry is essentially similar to that of arithmetic. The earliest geometric
concepts and information also go back to prehistoric times and also
result from practical activity.

Early man took over geometric forms from nature. The circle and the
crescent of the moon, the smooth surface of a lake, the straightness of a
ray of light or of a well-proportioned tree existed long before man himself
and presented themselves constantly to his observation. But in nature itself
our eyes seldom meet with really straight lines, with precise triangles or
squares, and it is clear that the chief reason why men gradually worked
out a conception of these figures is that their observation of nature was an
active one, in the sense that, to meet their practical needs, they manufactured
objects more and more regular in shape. They built dwellings,
cut stones, enclosed plots of land, stretched bowstrings in their bows,
modeled their clay pottery, brought it to perfection and correspondingly
formed the notion that a pot is curved, but a stretched bowstring is straight. 
In short, they first gave form to their material and only then recognized 
form as that which is impressed on material and can therefore be con- 
sidered in itself, as an abstraction from material. By recognizing the form 
of bodies, man was able to improve his handiwork and thereby to work 
out still more precisely the abstract notion of form. Thus practical activity 
served as a basis for the abstract concepts of geometry. It was necessary
to manufacture thousands of objects with straight edges, to stretch 
thousands of threads, to draw upon the ground a large number of straight 
lines, before men could form a clear notion of the straight line in general, 
as that quality which is common to all these particular cases. Nowadays
we learn early in life to draw a straight line, since we are surrounded by
objects with straight edges that are the result of manufacture, and it is
only for this reason that in our childhood we already form a clear notion
of the straight line. In exactly the same way the notion of geometric
magnitudes, of length, area, and volume, arose from practical activity.
People measured lengths, determined distances, estimated by eye the
area of surfaces and the volumes of bodies, all for their practical purposes.
It was in this way that the simplest general laws were discovered, the
first geometric relations: for example, that the area of a rectangle is equal
to the product of the lengths of its sides. It is useful for a farmer to be 
aware of such a relation, in order that he may estimate the area he has 
sowed and consequently the harvest he may expect. 

So we see that geometry took its rise from practical activity and from 
the problems of daily life. On this question the ancient Greek scholar, 
Eudemus of Rhodes, wrote as follows: “Geometry was discovered by the
Egyptians as a result of their measurement of land. This measurement was
necessary for them because of the inundations of the Nile, which constantly
washed away their boundaries.* There is nothing remarkable in the fact
that this science, like the others, arose from the practical needs of men.
All knowledge that arises from imperfect circumstances tends to perfect
itself. It arises from sense impressions but gradually becomes an object 
of our contemplation and finally enters the realm of the intellect.” 

* What is meant here is the boundaries between shares of land. Let us note, parenthetically,
that geometry means land-measurement (in ancient Greek “ge” is land, and
“metron” is measure).

Of course, the measurement of land was not the only problem that led
the ancients toward geometry. From the fragmentary texts that have
survived, it is possible to form some idea of various problems of the ancient
Egyptians and Babylonians and of their methods for solving them.
One of the oldest Egyptian texts goes back to 1700 B.C. This is a manual
of instruction for “secretaries” (royal officers), written by a certain Ahmes.
It contains a collection of problems on calculating the capacity of 
containers and warehouses, the area of shares of land, the dimensions of 
earthworks, and so forth. 

The Egyptians and Babylonians were able to determine the simplest
areas and volumes, they knew with considerable exactness the ratio of
the circumference to the diameter of a circle, and perhaps they were even
able to calculate the surface area of a sphere; in a word, they already
possessed a considerable store of geometrical knowledge. But so far as
we can tell, they were still not in possession of geometry as a theoretical
science with theorems and proofs. Like the arithmetic of the time, geometry
was basically a collection of rules deduced from experience. Moreover,
geometry was in general not distinguished from arithmetic. Geometric
problems were at the same time problems for calculation in arithmetic. 

In the 7th century B.C., geometry passed from Egypt to Greece, where 
it was further developed by the great materialist philosophers, Thales,
Democritus, and others. A considerable contribution to geometry was
also made by the successors of Pythagoras, the founders of an idealistic 
religiophilosophical school. 

The development of geometry took the direction of compiling new facts 
and clarifying their relations with one another. These relations were 
gradually transformed into logical deductions of certain propositions of
geometry from certain others. This had two results: first, the concept of 
a geometrical theorem and its proof; and second, the clarification of those 
fundamental propositions from which the others may be deduced, namely, 
the axioms. 

In this way geometry gradually developed into a mathematical 
theory. 

It is well known that systematic expositions of geometry appeared in 
Greece as far back as the 5th century B.C., but they have not been
preserved, for the obvious reason that they were all supplanted by the
“Elements” of Euclid (3rd century B.C.). In this work, geometry was
presented as such a well-formed system that nothing essential was added
to its foundations until the time of N. I. Lobačevskiĭ, more than two
thousand years later. The well-known school text of Kiselev, like school
books over the whole world, represented in its older editions nothing
but a popular reworking of Euclid. Very few other books in the world
have had such a long life as the “Elements” of Euclid, this perfect creation
of Greek genius. Of course, mathematics continued to advance, and our
understanding of the foundations of geometry has been considerably
deepened; nevertheless the “Elements” of Euclid became, and to a great
extent remain, the model of a book on pure mathematics. Bringing together
the accomplishments of his predecessors, Euclid presented the mathematics
of his time as an independent theoretical science; that is, he presented it 
essentially as it is understood today. 

2. The essential nature of geometry. The history of geometry leads to 
the same conclusions as that of arithmetic. We see that geometry arose 
from practical life and that its transformation to a mathematical theory 
required an immense period of time. 

Geometry operates with “geometric bodies” and figures; it studies their
mutual relations from the point of view of magnitude and position. But a
geometric body is nothing other than an actual body considered solely
from the point of view of its spatial form,* in abstraction from all its
other properties such as density, color, or weight. A geometric figure is a
still more general concept, since in this case it is possible to abstract from
spatial extension also; thus a surface has only two dimensions, a line,
only one dimension, and a point, none at all. A point is the abstract
concept of the end of a line, of a position defined to the limit of precision
so that it no longer has any parts. It is in this way that all these concepts
are defined by Euclid. 

Thus geometry has as its object the spatial forms and relations of actual 
bodies, removed from their other properties and considered from the 
purely abstract point of view. It is just this high level of abstraction that 
distinguishes geometry from the other sciences that also investigate the 
spatial forms and relations of bodies. In astronomy for example, the 
mutual positions of bodies are studied, but they are the actual bodies 
of the sky; in geodesy it is the form of the earth that is studied, in crystallography,
the form of crystals, and so forth. In all these other sciences, the
form and the position of concrete bodies are studied in their dependence
on other properties of the bodies. 

This abstraction necessarily leads to the purely theoretical method of 
geometry; it is no longer possible to set up experiments with breadthless 
straight lines, with “pure forms.” The only possibility is to make use of
logical argument, deriving some conclusions from others. A geometrical 
theorem must be proved by reasoning, otherwise it does not belong to 
geometry; it does not deal with “pure forms.” 

The self-evidence of the basic concepts of geometry, the methods of
reasoning and the certainty of their conclusions, all have the same source
as in arithmetic. The properties of geometric concepts, like the concepts
themselves, have been abstracted from the world around us. It was
necessary for people to draw innumerable straight lines before they could
take it as an axiom that through every two points it is possible to draw a
straight line; they had to move various bodies about and apply them to
one another on countless occasions before they could generalize their
experience to the notion of superposition of geometric figures and make
use of this notion for the proof of theorems, as is done in the well-known
theorems about congruence of triangles.

* By form we mean also dimensions.

Finally, we must emphasize the generality of geometry. The volume of
a sphere is equal to $\frac{4}{3}\pi R^3$ quite independently of whether we are
speaking of a spherical vessel, of a steel sphere, of a star, or of a drop of
water. Geometry can abstract what is common to all bodies, because every
actual body does have more or less definite form, dimensions, and position
with respect to other bodies. So it is no cause for wonder that geometry finds
application almost as widely as arithmetic. Workmen measuring the
dimensions of a building or reading a blueprint, an artilleryman
determining the distance to his target, a farmer measuring the area of his field,
an engineer estimating the volume of earthworks, all these people make
use of the elements of geometry. The pilot, the astronomer, the surveyor, the
engineer, the physicist, all have need of the precise conclusions of geometry.

A clear example of the abstract-geometrical solution of an important 
problem in physics is provided by the investigations of the well-known 
crystallographer and geometer, E. S. Fedorov. The problem he set himself 
of finding all the possible forms of symmetry for crystals is one of the
most fundamental in theoretical crystallography. To solve this problem,
Fedorov made an abstraction from all the physical properties of a crystal,
considering it only as a regular system of geometric bodies “in place of a
system of concrete atoms.” Thus the problem became one of finding all
the forms of symmetry which could possibly exist in a system of geometric
bodies. This purely geometrical problem was completely solved by
Fedorov, who found all the possible forms of symmetry, 230 in number.
His solution proved to be an important contribution to geometry and
was the source of many geometric investigations.

In this example, as in the whole history of geometry, we detect the 
prime moving force in the development of geometry. It is the mutual 
influence of practical life and abstract thought. The problem of discovering 
possible symmetries originated in physical observation of crystals but was 
transformed into an abstract problem and so gave rise to a new
mathematical theory, the theory of regular systems, or of the so-called Fedorov
groups.* Subsequently this theory not only found brilliant confirmation
in the practical observation of crystals but also served as a general guide 
in the development of crystallography, giving rise to new investigations, 
both in experimental physics and in pure mathematics. 


* Compare Chapter XX.

§4. Arithmetic and Geometry

1. The origin of fractions in the interrelation of arithmetic and geometry. 

Up to now we have considered arithmetic and geometry apart from each
other. Their mutual relation, and consequently the more general
interrelation of all mathematical theories, has so far escaped our attention.
Nevertheless this relation has exceptionally great significance. The
interaction of mathematical theories leads to advances in mathematics itself
and also uncovers a rich treasure of mutual relations in the actual world
reflected by these theories.

Arithmetic and geometry are not only applied to each other but they 
also serve thereby as sources for further general ideas, methods, and 
theories. In the final analysis, arithmetic and geometry are the two roots
from which has grown the whole of mathematics. Their mutual influence
goes back to the time when both of them had just come into being. Even
the simple measurement of a line represents a union of geometry and
arithmetic. To measure the length of an object we apply to it a certain
unit of length and calculate how many times it is possible to do this; the
first operation (application) is geometric, the second (calculation) is
arithmetical. Everyone who counts off his steps along a road is already 
uniting these two operations. 

In general, the measurement of any magnitude combines calculation 
with some specific operation which is characteristic of this sort of
magnitude. It is sufficient to mention measurement of a liquid in a
graduated container or measurement of an interval of time by counting the
number of strokes of a pendulum. 

But in the process of measurement it turns out, generally speaking, that
the chosen unit is not contained in the measured magnitude an integral
number of times, so that a simple calculation of the number of units is not
sufficient. It becomes necessary to divide up the unit of measurement in
order to express the magnitude more accurately by parts of the unit;
that is, no longer by whole numbers but by fractions. It was in this way
that fractions actually arose, as is shown by an analysis of historical and
other data. They arose from the division and comparison of continuous
magnitudes; in other words, from measurement. The first magnitudes to
be measured were geometric, namely lengths, areas of fields, and volumes
of liquids or friable materials, so that in the earliest appearance of fractions
we see the mutual action of arithmetic and geometry. This interaction
leads to the appearance of an important new concept, namely of fractions,
as an extension of the concept of number from whole numbers to fractional
numbers (or as the mathematicians say, to rational numbers, expressing
the ratio of whole numbers). Fractions did not arise, and could not arise,
from the division of whole numbers, since only whole objects are counted
by whole numbers. Three men, three arrows, and so forth, all these
make sense, but two-thirds of a man and even two-thirds of an arrow are
senseless concepts; even three separate thirds of an arrow will not kill a
deer, for this it is necessary to have a whole arrow.

2. Incommensurable magnitudes. In the development of the concept 
of number, arising from the mutual action of arithmetic and geometry, 
the appearance of fractions was only the first step. The next was the 
discovery of incommensurable intervals. Let us recall that intervals are 
called incommensurable if no interval exists which can be applied to each 
of them a whole number of times or, in other words, if their ratio cannot
be expressed by an ordinary fraction; that is, by a ratio of whole numbers. 

At first people simply did not think about the question whether every 
interval can be expressed by a fraction. If in dividing up or measuring an 
interval they came upon very small parts, they merely discarded them; 
in practice, it made no sense to speak of infinite precision of measurement.
Democritus even advanced the notion that geometrical figures consist of
atoms of a particular kind. This notion, which to our view seems quite
strange, proved very fruitful in the determination of areas and volumes.
An area was calculated as the sum of rows consisting of atoms, and a
volume as the sum of atomic layers. It was in this way, for example, that
Democritus found the volume of a cone. A reader who understands the
integral calculus will note that this method already forms the prototype
of the determination of areas and volumes by the methods of the integral
calculus. Moreover, in returning in thought to the times of Democritus,
one must attempt to free oneself of the customary notions of today, which
have become firmly fixed in our minds by the development of mathematics.
At the time of Democritus, geometrical figures were not yet separated
from actual ones to the same extent as is now the case. Since Democritus
considered actual bodies as consisting of atoms, he naturally also regarded
geometrical figures in the same light.

But the notion that intervals consist of atoms comes into contradiction
with the theorem of Pythagoras, since it follows from this theorem that 
incommensurable intervals exist. For example, the diagonal of a square is 
incommensurable with its side; in other words, the ratio of the two 
cannot be expressed as the ratio of whole numbers. 

We shall prove that the side and the diagonal of a square are in fact
incommensurable. If a is the side and b is the diagonal of a square, then
according to the theorem of Pythagoras $b^2 = a^2 + a^2 = 2a^2$ and therefore

$$\left(\frac{b}{a}\right)^2 = 2.$$

But there is no fraction such that its square is equal to 2. In fact, if we
suppose that there is, let p and q be whole numbers for which

$$\left(\frac{p}{q}\right)^2 = 2,$$

where we may assume that p and q have no common factor, since otherwise
we could simplify the fraction. But if $(p/q)^2 = 2$, then $p^2 = 2q^2$, and
therefore $p^2$ is divisible by 2. In this case $p^2$ is also divisible by 4, since it
is the square of an even number. So $p^2 = 4q_1$; that is, $2q^2 = 4q_1$, and
$q^2 = 2q_1$. From this it follows that q must also be divisible by 2. But this
contradicts the supposition that p and q have no common factor. This
contradiction proves that the ratio b/a cannot be expressed by a rational
number. The diagonal and the side of a square are incommensurable.
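
The argument above can be accompanied by a small numerical check. The
following Python sketch is an illustration only and is not part of the proof:
it searches for whole numbers p and q with $p^2 = 2q^2$ among all denominators
up to a chosen bound. The function name and the bound are our own choices.
A finite search can never replace the proof, but it shows concretely that no
such fraction turns up.

```python
from math import isqrt


def fractions_with_square_two(max_q):
    """Return all pairs (p, q), 1 <= q <= max_q, with p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        # isqrt gives the largest whole number p with p*p <= 2*q*q,
        # so it is the only candidate numerator for this q.
        p = isqrt(2 * q * q)
        if p * p == 2 * q * q:
            hits.append((p, q))
    return hits


# The proof says this list must be empty for every bound.
print(fractions_with_square_two(10_000))  # prints []
```

Exact integer arithmetic is used on purpose: a floating-point square root
would introduce rounding and could not distinguish "very close to 2" from
"exactly 2".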

This discovery made a great impression on the Greek scientists. 
Nowadays, when we are accustomed to irrational numbers and calculate 
freely with square roots, the existence of incommensurable intervals does 
not disturb us. But in the 5th century B.C., the discovery of such intervals 
had a completely different aspect for the Greeks. Since they did not have
the concept of an irrational number and never wrote a symbol like $\sqrt{2}$,
the previous result indicated that the ratio of the diagonal and the side
of the square was not represented by any number at all.

In the existence of incommensurable intervals the Greeks discovered a 
profound paradox inherent in the concept of continuity, one of the
expressions of the dialectical contradiction comprised in continuity and motion.
Many important Greek philosophers considered this contradiction; 
particularly well-known among them, because of his paradoxes, is Zeno 
the Eleatic. 

The Greeks founded a theory of ratios of intervals, or of magnitudes in
general, which takes into consideration the existence of incommensurable
intervals;* it is expounded in the “Elements” of Euclid, and in simplified
form is explained today in high school courses in geometry. But to
recognize that the ratio of one interval to another (if the second interval
is taken as the unit of length, this ratio is simply the length of the first
interval) may also be considered as a number, whereby the very concept
of number is generalized, to this idea the Greeks were not able to rise:
The concept of an irrational number simply did not originate among
them.† This step was taken at a later period by the mathematicians
of the East; and in general, a mathematically rigorous definition of a real
number, not depending immediately on geometry, was given only recently:
in the seventies of the last century.‡ The passage of such an immense
period of time after the founding of the theory of ratios shows how
difficult it is to discover abstract concepts and give them exact formulation.

* This theory is ascribed to the Greek scientist Eudoxus, who lived in the 4th century
B.C.

† As a result of the fact that the theory of the measurement of magnitudes did not
become part of arithmetic but passed over into geometry, mathematics among the
Greeks was engulfed by geometry. Such questions, for example, as the solution of
quadratic equations, which today we treat in an algebraic way, they stated and solved
geometrically. The “Elements” of Euclid contain a considerable number of such
questions, which obviously represented for contemporary mathematicians a summary
of the foundations not only of geometry in our sense but of mathematics in general.
This domination by geometry continued up to the time when Descartes, on the contrary,
subjected geometry to algebra. Traces of the long domination by geometry are
preserved, for example, in such names as “square” and “cube” for the second and third
powers: “a cubed” is a cube with side a.

‡ We are speaking here not of a descriptive definition, but of a definition which
serves as the immediate basis for proofs of theorems about the properties of real
numbers. It is natural that such definitions should arise at a later period, when the
development of mathematics, and in particular of the infinitesimal analysis, required
a suitable definition of the real number represented by “the variable x.” This definition
was given in various forms in the seventies of the last century by the German
mathematicians Weierstrass, Dedekind, and Cantor.

3. The real number. In describing the concept of a real number,
Newton in his “General Arithmetic” wrote: “by number we mean not so
much a collection of units as an abstract ratio of a certain quantity to
another quantity taken as the unit.” This number (ratio) may be integral,
rational, or if the given magnitude is incommensurable with the unit,
irrational.

A real number in its original sense is therefore nothing but the ratio of
one magnitude to another taken as a unit; in particular cases this is a ratio
of intervals, but it may also be a ratio of areas, weights, and so forth.

Consequently, a real number is a ratio of magnitudes in general,
considered in abstraction from their concrete nature.

Just as abstract whole numbers are of mathematical interest only in
their relations with one another, so abstract real numbers have content
and become an object of mathematical attention only in relation with one
another in the system of real numbers.

In the theory of real numbers, just as in arithmetic, it is first necessary
to define operations on numbers: addition, subtraction, multiplication,
division, and also the relations expressed by such words as “greater than” or
“less than.” These operations and relations reflect actual connections
among the various magnitudes; for example, addition reflects the placing
together of intervals. A beginning on operations with abstract real numbers
was made in the Middle Ages by the mathematicians of the East. Later
came the gradual discovery of the most important property of the system
of real numbers, its continuity. The system of real numbers is the
abstract image of all the possible values of a continuously varying
magnitude.

In this way, as in the similar case of whole numbers, the arithmetic of 
real numbers deals with the actual quantitative relations of continuous 
magnitudes, which it studies in their general form, in complete abstraction
from all concrete properties. It is precisely because real numbers deal with
what is common to all continuous magnitudes that they have such wide
application: The values of various magnitudes, a length, a weight, the 
strength of an electric current, energy and so forth, are expressed by 
numbers, and the interdependence or relations among these entities are 
mirrored as relations among their numerical values. 

To show how the general concept of real numbers can serve as the basis 
of a mathematical theory, we must give their mathematical definition in
a formal way. This may be done by various methods, but perhaps the
most natural is to proceed from the very process of measurement of 
magnitudes which actually did lead in practical life to this generalization 
of the concept of number. We will speak about the length of intervals, 
but the reader will readily perceive that we could argue in exactly 
the same way about any other magnitudes which permit indefinite 
subdivision. 

Let us suppose that we wish to measure the interval AB by means of the 
interval CD taken as a unit (figure 1). 

[Figure: the interval AB, with the point P marking the remainder PB, and the
unit interval CD.]

Fig. 1.

We apply the interval CD to AB, beginning for example with the point A,
as long as CD goes into AB. Suppose this is $n_0$ times. If there still remains
from the interval AB a remainder PB, then we divide the interval CD
into ten parts and measure the remainder with these tenths. Suppose that $n_1$
of the tenths go into the remainder. If after this there is still a remainder,
we divide our measure into ten parts again; that is, we divide CD into
a hundred parts, and repeat the same operation, and so forth. Either the
process of measurement comes to an end, or it continues. In either
case we reach the result that in the interval AB the whole interval CD is
contained $n_0$ times, the tenths are contained $n_1$ times, the hundredths $n_2$
times and so forth. In a word, we derive the ratio of AB to CD with
increasing accuracy: up to tenths, to hundredths, and so forth. So the
ratio itself is represented by a decimal fraction with $n_0$ units, $n_1$ tenths and
so forth

$$\frac{AB}{CD} = n_0.n_1 n_2 n_3 \cdots$$

This decimal fraction may be infinite, corresponding to the possibility
of indefinite increase in the precision of measurement.

Thus the ratio of two intervals, or of two magnitudes in general, is
always representable by a decimal fraction, finite or infinite. But in the
decimal fraction there is no longer any trace of the concrete magnitude
itself; it represents exactly the abstract ratio, the real number. Thus a
real number may be formally defined if we wish, as a finite or infinite
decimal fraction.*
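
The measurement process just described can be written out as a short
procedure. The Python sketch below is only an illustration under our own
assumptions: the lengths of AB and CD are given as exact rational numbers,
and the name `measure` and its parameters are hypothetical. It lays off the
unit, then tenths of the unit, then hundredths, and records how many times
each goes into what remains.

```python
from fractions import Fraction


def measure(ab, cd, places):
    """Digits n0, n1, n2, ... of the ratio ab/cd, up to the given decimal place.

    ab and cd stand for the lengths of the measured interval and of the unit.
    """
    unit = Fraction(cd)
    remainder = Fraction(ab)
    digits = []
    for _ in range(places + 1):
        n = remainder // unit      # how many times the current unit fits
        digits.append(int(n))
        remainder -= n * unit      # what is still left unmeasured
        if remainder == 0:         # the process has come to an end
            break
        unit /= 10                 # pass to tenths, hundredths, and so on
    return digits


print(measure(Fraction(7, 4), 1, 5))   # a measurement that terminates
print(measure(Fraction(1, 3), 1, 4))   # a measurement that never terminates
```

For an interval of length 7/4 the process stops after the hundredths, while
for 1/3 it continues without end, so any finite run yields only the first
digits of an infinite decimal fraction.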

Our definition will be complete if we say what we mean by the operations
of addition and so forth for decimal fractions. This is done in such a way
that the operations defined on decimal fractions correspond to the
operations on the magnitudes themselves. Thus, when intervals are put together
their lengths are added; that is, the length of the interval AB + BC is
equal to the sum of the lengths AB and BC. In defining the operations on
real numbers, there is a difficulty that these numbers are represented in
general by infinite decimal fractions, while the well-known rules for these
operations refer to finite decimal fractions. A rigorous definition of the
operations for infinite decimals may be made in the following way.
Suppose, for example, that we must add the two numbers a and b. We take
the corresponding decimal fractions up to a given decimal place, say the
millionth, and add them. We thus obtain the sum a + b with
corresponding accuracy, up to two millionths, since the errors in a and b may
be added together. So we are able to define the sum of two numbers with
an arbitrary degree of accuracy, and in that sense their sum is completely
defined, although at each stage of the calculation it is known only with
a certain accuracy. But this corresponds to the essential nature of the case,
since each of the magnitudes a and b is also measured only with a certain
accuracy, and the exact value of each of the corresponding infinite fractions
is obtained as the result of an indefinitely extended increase in accuracy.
The relations “greater than” and “less than” may then be defined by
means of addition: a is greater than b if there exists a magnitude c such
that $a = b + c$, where we are speaking, of course, of positive numbers.
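
The error estimate in this definition of addition can be checked on an
example. The following Python sketch rests on our own assumptions: the two
positive numbers are taken to be the rationals 1/3 and 2/7, whose decimal
fractions are infinite, and the helper names `truncate` and
`approximate_sum` are hypothetical. Each number is cut off at the millionths
place, the truncated values are added, and the result is compared with the
exact sum.

```python
from fractions import Fraction


def truncate(x, places):
    """Drop every decimal digit of the positive number x beyond the given place."""
    scale = 10 ** places
    return Fraction(int(x * scale), scale)


def approximate_sum(a, b, places):
    """Add the decimal fractions of a and b taken up to the given place."""
    return truncate(a, places) + truncate(b, places)


a = Fraction(1, 3)   # 0.333333...
b = Fraction(2, 7)   # 0.285714...

error = (a + b) - approximate_sum(a, b, 6)

# Each truncation loses less than one millionth, so the sum is
# accurate to within two millionths.
print(0 <= error < Fraction(2, 10**6))
```

Taking more decimal places makes the bound as small as desired, which is the
sense in which the sum of two infinite decimal fractions is completely
defined.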

* Fractions with the periodic digit nine are not considered here; they are identical
with the corresponding fraction without nines according to the well-known rule,
which is clear from the example: 0.139999 ··· = 0.140000 ···.

The continuity of the sequence of real numbers finds expression in the
fact that if the numbers $a_1, a_2, \cdots$ increase and $b_1, b_2, \cdots$ diminish but
always remain greater than the $a_i$, then between the one series of numbers
and the other there is always a number c. This may be visualized on a
straight line if its points are put into correspondence with the numbers
(figure 2) according to the well-known rule.

[Figure: a straight line on which the points $a_1$, $a_2$, c, $b_2$, $b_1$ are marked
in that order.]

Fig. 2.

Here it is clearly seen that the presence of the number c and of the point
corresponding to it signify the absence of a break in the series of numbers,
which is what is meant by their continuity.
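
A concrete pair of such series may help to see what this property says. In
the Python sketch below, which is our own illustration and uses a
hypothetical helper `bounds`, the numbers $a_n$ are the decimal fractions of
$\sqrt{2}$ cut off at the n-th place and the numbers $b_n$ exceed them by one
unit of that place. The $a_n$ never decrease, the $b_n$ never increase, and
every $a_n$ stays below every $b_n$.

```python
from fractions import Fraction
from math import isqrt


def bounds(n):
    """Lower and upper decimal bounds for the square root of 2 with n places."""
    scale = 10 ** n
    a = Fraction(isqrt(2 * scale * scale), scale)  # largest n-place decimal with a*a <= 2
    b = a + Fraction(1, scale)                     # one unit of the n-th place above it
    return a, b


for n in range(6):
    a, b = bounds(n)
    # In every row the square of a is below 2 and the square of b is above 2.
    print(n, a, b, a * a < 2 < b * b)
```

All the $a_n$ and $b_n$ here are rational, yet the only number lying between the
two series is $\sqrt{2}$, which is not rational. Among the rational numbers
alone there would thus be a break at this place; the continuity of the real
numbers asserts that the number c always exists.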

4. The conflict of opposites: concrete and abstract. Already in the
example of the interaction of arithmetic and geometry we can see that the
development of mathematics is a process of conflict among the many
contrasting elements: the concrete and the abstract, the particular and the
general, the formal and the material, the finite and the infinite, the discrete
and the continuous, and so forth. Let us try, for example, to trace the
contrast between concrete and abstract in the formation of the concept of
a real number. As we have seen, the real number reflects an infinitely
improvable process of measurement or, in slightly different terms, an
absolutely accurate determination of a magnitude. This corresponds to
the fact that in geometry we consider ideally precise forms and dimensions
of bodies, abstracting altogether from the mobility of concrete objects
and from a certain indefiniteness in their actual forms and dimensions;
for example, the interval measured (figure 1) was a completely ideal one.

But ideally precise geometric forms and absolutely precise values for
magnitudes represent abstractions. No concrete object has absolutely
precise form nor can any concrete magnitude be measured with absolute
accuracy, since it does not even have an absolutely accurate value. The
length of a line segment, for example, has no sense if one tries to make it
precise beyond the limits of atomic dimensions. In every case when one
passes beyond well-known limits of quantitative accuracy, there appears
a qualitative change in the magnitude, and in general it loses its original
meaning. For example, the pressure of a gas cannot be made precise beyond
the limits of the impact of a single molecule; electric charge ceases to be
continuous when one tries to make it precise beyond the charge on an
electron and so forth. In view of the absence in nature of objects of ideally
precise form, the assertion that the ratio of the diagonal of a square to the
side is equal to $\sqrt{2}$ not only cannot be deduced with absolute accuracy
from immediate measurement but does not even have any absolutely
accurate meaning for an actual concrete square.

The conclusion that the diagonal and the side of a square are
incommensurable comes, as we have seen, from the theorem of Pythagoras.
This is a theoretical conclusion based on a development of the data of
experience; it is a result of the application of logic to the original premises
of geometry, which are taken from experience.

In this way the concept of incommensurable intervals, and all the more
of real numbers, is not a simple immediate reflection of the facts of
experience but goes beyond them. This is quite understandable. The real
number does not reflect any given concrete magnitude but rather
magnitude in general, in abstraction from all concreteness; in other words,
it reflects what is common to particular concrete magnitudes. What is
common to all of them consists in particular in this, that the value of the
magnitude can be determined more and more precisely; and if we abstract
from concrete magnitudes, then the limit of this possible increase in
precision, which depends on the concrete nature of the magnitude,
becomes indefinite and disappears.

In this way a mathematical theory of magnitudes, since it considers 
magnitudes in abstraction from their individual nature, must inevitably 
consider the possibility of unlimited accuracy for the value of the
magnitude and must thereby lead to the concept of a real number. At the same
time, since it reflects only what is common to various magnitudes,
mathematics takes no account of the peculiarity of each individual magnitude.

Since mathematics selects only general properties for consideration, it
operates with its clearly defined abstractions quite independently of the
actual limits of their applicability, as must happen precisely because these
limits are different in different particular cases. These limits depend on
the concrete properties of the phenomena under consideration and on the
qualitative changes that take place in them. So in making an application of 
mathematics, it is necessary to verify the actual applicability of the theory 
in question. To consider matter as continuous and to describe its properties 
by continuous magnitudes is permissible only if we may abstract from its 
atomic structure, and this is possible only under well-known conditions. 

Nevertheless, the real numbers represent a trustworthy and powerful 
instrument for the mathematical investigation of actual continuous 
magnitudes and processes. Their theory is based on practice, on an 
immense field of applications in physics, technology, and chemistry. 
Consequently, practice shows that the concept of the real number correctly 
reflects the general properties of magnitudes. But this correctness is not 
without limits; it is not possible to consider the theory of real numbers as
something absolute, allowing an unlimited abstract development in
complete separation from reality. The very concept of the real number is
continuing to develop and is in fact still far from being complete.

5. The conflict of opposites: discrete and continuous. The role of
another of the mentioned contrasts, the contrast between the discrete and
the continuous, may also be illustrated by the development of the concept
of number. We have already seen that fractions arose from the division
of continuous magnitudes. 

On this theme of division there is a humorous question which is
extraordinarily instructive. Grandmother has bought three potatoes and must
divide them equally between two grandsons. How is she to do it? The 
answer is: make mashed potatoes. 

The joke reveals the very essence of the matter. Separate objects are 
indivisible in the sense that, when divided, the object almost always 
ceases to be what it was before, as is clear from the example of “thirds 
of a man” or “thirds of an arrow.” On the other hand, continuous and 
homogeneous magnitudes or objects may easily be divided and put together
again without losing their essential character. Mashed potatoes offer an
excellent example of a homogeneous object, which in itself is not separated
into parts but may nevertheless be divided in practice into as small parts
as desired. Lengths, areas, and volumes have the same property. Although
they are continuous in their very essence and are not actually divided into 
parts, nevertheless they offer the possibility of being divided without limit. 

Here we encounter two contrasting kinds of objects: on the one hand, 
the indivisible, separate, discrete objects; and on the other, the objects
which are completely divisible and yet are not divided into parts but are
continuous. Of course, these contrasting characteristics are always united,
since there are no absolutely indivisible and no completely continuous
objects. Yet these aspects of the objects have an actual existence, and it
often happens that one aspect is decisive in one case and the other in
another. 

In abstracting forms from their content, mathematics by this very act 
sharply divides these forms into two classes, the discrete and the
continuous.

The mathematical model of a separate object is the unit, and the
mathematical model of a collection of discrete objects is a sum of units, which is,
so to speak, the image of pure discreteness, purified of all other qualities.
On the other hand, the fundamental, original mathematical model of
continuity is the geometric figure; in the simplest case, the straight line.

We have before us therefore two contrasts, discreteness and continuity,
and their abstract mathematical images: the whole number and the
geometric extension. Measurement consists of the union of these contrasts:
The continuous is measured by separate units. But the inseparable units
are not enough; we must introduce fractional parts of the original unit.
In this way the fractional numbers arise and the concept of numbers
develops precisely as a result of the union of the mentioned contrasts.

Then, on a more abstract level, appeared the concept of incommensurable
intervals, and, as a result, the real number as an abstract image of
unlimited increase in accuracy in the determination of a magnitude. This
concept was not formed immediately, and the long path of its development
led through many a conflict between these same two contrasting elements,
the discrete and the continuous.

In the first place, Democritus represented figures as consisting of atoms 
and in this way reduced the continuous to the discrete. But the discovery
of incommensurable intervals led to the abandonment of such a
representation. After this discovery continuous magnitudes were no longer thought
of as consisting of separate elements, atoms or points, and they were not
represented by numbers, since numbers other than the whole numbers 
and the fractions were not known at that time. 

The contrast between the continuous and the discrete appeared in
mathematics again with renewed force in the 17th century, when the
foundations of the differential and integral calculus were being laid. Here
it was the infinitesimal that was under discussion. In some accounts the
infinitesimal was thought of as a real, “actually” infinitesimal,
“indivisible” particle of the continuous magnitude, like the atoms of
Democritus, except that now the number of these particles was considered
to be infinitely great. Calculation of areas and volumes, or in other words
integration, was thought of as summation of an infinite number of these
infinitely small particles. An area, for example, was understood as “the sum
of the lines from which it is formed” (figure 3). Consequently, the
continuous was again reduced to the discrete, but now in a more complicated
way, on a higher level. But this point of view also proved unsatisfactory,
and, as a counterweight to it, there appeared, on the basis of Newton’s work,
the notion of continuous variables, of the infinitesimal as a continuous
variable decreasing without limit. This conception finally carried the day at
the beginning of the 19th century, when the rigorous theory of limits was
founded. An interval was now thought of as consisting not of points or
“indivisibles,” but as an extension, as a continuous medium, where it was
only possible to fix separate points, separate values of a continuous
magnitude. Mathematicians then spoke of “extension.” In the union of the
discrete and the continuous, it was again the continuous that dominated.

Fig. 3.

But the development of analysis demanded further precision in the
theory of variable magnitudes and above all in the general definition of a
real number as an arbitrary possible value of a variable magnitude. In the
seventies of the 19th century there arose a theory of real numbers which
represents an interval as a set of points, and correspondingly the range of
variation of a variable as a set of real numbers. The continuous again
consisted of separate discrete points and the properties of continuity
were again expressed in the structure of the set of points that formed it.
This conception led to immense progress in mathematics and became
dominant. But again profound difficulties were discovered in it, and these
led to attempts to return on a new level to the notion of pure continuity.
Other attempts were made to change the concept of an interval as a set 
of points. New points of view appeared for the concepts of number, 
variable, and function. The development of the theory is continuing, and 
we must await its further progress. 

6. Further results of the interaction of arithmetic and geometry. The
interaction of geometry and arithmetic played a role elsewhere than in
the formation of the concept of a real number. The same interaction of
geometry with arithmetic, or more accurately with algebra, also showed
itself in the formation of negative and complex numbers, that is of
numbers of the form $a + b\sqrt{-1}$. Negative numbers are represented by
points of the straight line to the left of the point representing zero. It
was exactly this geometric representation which gave imaginary numbers a
firm place in mathematics; up to that time they had not been understood.
New concepts of magnitude appeared, for example, vectors, which are
represented by directed line segments, and tensors, which are still more
general magnitudes; in these again algebra is united with geometry.

The union of various mathematical theories has always played a great
and sometimes decisive role in the development of mathematics. We shall
see this further on in the rise of analytic geometry, differential and integral
calculus, the theory of functions of a complex variable, the recent so-called
functional analysis, and other theories. Even in the theory of numbers
itself, that is in the study of whole numbers, methods are applied with
great success which depend on continuity (namely on the infinitesimal
analysis) and on geometry. These methods have given rise to extensive
chapters in the theory of numbers, the “analytic theory of numbers”
and the “geometry of numbers.”

From a certain well-known point of view, it is possible to regard the
foundations of mathematics as the union of concepts arising from geometry
and arithmetic; that is to say, of the general concepts of continuity and of
algebraic operations (as generalizations of arithmetic operations). But we
will not be able to speak here of these difficult theories. The aim of the
present chapter has been to give an impression of the general interaction
of concepts, of the union and the conflict between contrasting ideas in
mathematics, as illustrated by the interaction of arithmetic and geometry
in the development of the concept of number.


§5. The Age of Elementary Mathematics 

1. The four periods of mathematics. The development of mathematics
cannot be reduced to the simple accumulation of new theorems but
includes essential qualitative changes. These qualitative changes
take place, however, not in a process of destruction or abolition of already
existing theories but in their being deepened and generalized, so as to
form more general theories, for which the way has been prepared by
preceding developments.

From the most general point of view, we may distinguish in the history 
of mathematics four fundamental, qualitatively distinct periods. Of course, 
it is not possible to draw exact boundary lines between these stages, since
the essential traits of each period appeared more or less gradually, but the 
distinctions among the stages and the passages from one to another are 
completely clear. 

The first stage (or period) is the period of the rise of mathematics as an
independent and purely theoretical science. It begins in the most ancient
times and extends to the 5th century B.C., or perhaps earlier, when the
Greeks laid the foundations of “pure” mathematics with its logical
connection between theorems and proofs (in that century there appeared, in
particular, systematic expositions of geometry like the “Elements” of
Hippocrates of Chios). This first stage was the period of the formation
of arithmetic and geometry, in the form considered earlier. At this time
mathematics consisted of a collection of separate rules deduced from
experience and immediately connected with practical life. These rules did
not yet form a logically unified system, since the theoretical character of
mathematics with its logical proof of theorems was formed very slowly,
as material for it was accumulated. Arithmetic and geometry were not
separated but were closely interwoven with each other.

The second period may be characterized as the period of elementary
mathematics, of the mathematics of constant magnitudes; its simple
fundamental results now form the content of a high school course. This
period extended for almost 2000 years and ended in the 17th century
with the rise of “higher” mathematics. It is with this period that we will
be concerned in greater detail in the present section. The following
sections will be devoted to the third and fourth periods, namely to the
founding and development of analysis and to the period of contemporary
mathematics.

2. Mathematics in Greece. The period of elementary mathematics may
in its turn be divided into two parts, distinguished by their basic content:
the period of the development of geometry (up to the 2nd century A.D.)
and the period of the predominance of algebra (from the 2nd to the 17th
century). With respect to historical conditions it is divided into three
parts, which may be called “Greek,” “Eastern,” and “European
Renaissance.” The Greek period coincides in time with the general flowering
of Greek culture, beginning with the 7th century B.C., reaching its
culmination in the 3rd century B.C. at the time of the great geometers of
antiquity, Euclid, Archimedes, and Apollonius, and ending in the 6th century
A.D. Mathematics, and especially geometry, enjoyed a wonderful
development in Greece. We know the names and the results of numerous
Greek mathematicians, although only a few genuine works have come
down to us. It is to be remarked that Rome gave nothing to mathematics
though it reached its zenith in the 1st century A.D. at a time when the
science of Greece, which had been conquered by Rome, was still
flourishing.

The Greeks not only developed and systematized elementary geometry
to the extent to which it is given in the “Elements” of Euclid and is now
taught in our secondary schools, but achieved considerably higher results.
They studied the conic sections: ellipse, hyperbola, parabola; they
proved certain theorems relating to the elements of what is called
projective geometry; guided by the needs of astronomy, they worked out
spherical geometry (in the 1st century A.D.) and also the elements of
trigonometry, and calculated the first tables of sines (Hipparchus, 2nd
century B.C. and Claudius Ptolemy, 2nd century A.D.);* they determined
the areas and volumes of a number of complicated figures; for example,
Archimedes found the area of the segment of a parabola by proving that
it is 2/3 of the area of the rectangle containing it (figure 4). The Greeks
were also acquainted with the theorem that of all bodies with a given
surface area the sphere has the greatest volume, but their proof has not
been preserved and was probably not complete. Such a proof is quite
difficult and was first discovered in the 19th century, by means of the
integral calculus.

Fig. 4.

* Ptolemy is widely known as the author of a system in which the Earth is considered
as the center of the universe and the motion of the heavenly bodies is described as
proceeding around it. This system was supplanted by the Copernican system.

In arithmetic and in the elements of algebra, the Greeks also made
considerable progress. As was mentioned earlier, they laid the foundation
for the theory of numbers. Here belong, for example, their investigations
on prime numbers (the theorem of Euclid on the existence of an infinite
number of prime numbers and the “sieve” of Eratosthenes for finding
prime numbers) and the solution of equations in whole numbers
(Diophantus about 246-330 A.D.).
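
The “sieve” of Eratosthenes mentioned above can be sketched in a few
lines of modern code. This is only an illustration in present-day terms,
not a reconstruction of the ancient procedure: the function name
sieve_of_eratosthenes and the choice to start crossing out at p*p (a
common modern shortcut) are assumptions of this sketch.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    # is_prime[n] stays True until n is crossed out as a multiple of a prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p; smaller
            # multiples were already crossed out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The numbers that survive the crossing-out are exactly the primes up to
the chosen limit.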

We have already said that the Greeks discovered irrational magnitudes
but considered them geometrically, as line segments. So the problems that
today we deal with algebraically were treated geometrically by the
Greeks. It was in this way that they solved quadratic equations and
transformed irrational expressions. For example, the equation that we
today write in the form $x^2 + ax = b^2$, they stated as follows: Find a
segment x such that if to the square constructed on it we add a rectangle
constructed on the same segment and on the given segment a, we obtain
a rectangle equal in area to a given square. This dominance of geometry
lasted a long time after the Greeks. They were also acquainted with
(geometric) methods for extracting square roots and cube roots and with
the properties of arithmetic and geometric progressions.
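
In present-day notation the quadratic equation quoted above is solved by
completing the square. The derivation below is a modern algebraic
restatement, offered only as an illustration; it is not a quotation of
the Greek geometric argument.

```latex
\begin{align*}
x^2 + ax &= b^2 \\
x^2 + ax + \left(\tfrac{a}{2}\right)^2 &= b^2 + \left(\tfrac{a}{2}\right)^2 \\
\left(x + \tfrac{a}{2}\right)^2 &= b^2 + \left(\tfrac{a}{2}\right)^2 \\
x &= \sqrt{b^2 + \left(\tfrac{a}{2}\right)^2} - \frac{a}{2}
\end{align*}
```

Only the positive root is kept, since x denotes a segment. The square
root is the hypotenuse of a right triangle with legs b and a/2, so the
segment x can be obtained by a construction with line segments alone.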

In this way the Greeks were already in possession of much of the
material of contemporary elementary algebra but not, however, of the
following essential elements: negative numbers and zero, irrational numbers
abstracted entirely from geometry, and finally a well-developed system
of literal symbols. It is true that Diophantus made use of literal symbols
for the unknown quantity and its powers and also of special symbols for
addition, subtraction, and equality, but his algebraic equations were still
written with concrete numerical coefficients.

In geometry the Greeks attained what we now call “higher” mathematics.
Archimedes made use of integral calculus for the calculation of areas
and volumes and Apollonius used analytic geometry in his investigations
on conic sections. Apollonius actually gives the equations of these curves*
but expresses them in geometric language. In these equations there
does not yet appear the general notion of an arbitrary constant or of
a variable magnitude; and the necessary means of expressing such
concepts, namely the literal symbols of algebra, appear only at a later
age; they alone could convert such investigations into a source of new
theories, which would be truly a part of higher mathematics. The founders
of these new theories were guided, a thousand years later, by the legacy
of the Greek scientists; in fact, the “Geometry” of Descartes (1637),
which laid the foundation for analytic geometry, begins with a selection
of problems left by the Greeks.

* He gives the “equations” of conic sections referred to a vertex. For example “the
equation” of the parabola $y^2 = 2px$ is formulated thus: The square on the side y is
equal in area to the rectangle with sides 2p and x. Of course, in place of the symbols
p, x, y he uses the corresponding line segments.

Such is the general rule. The old theories, by giving rise to new and
profound problems, outgrow themselves, as it were, and demand for
further progress new forms and new ideas. But these forms and ideas may
demand new historical conditions for their birth. In ancient society the
conditions necessary for the passage to higher mathematics did not and
could not exist; they came on the scene with the development of the
natural sciences in modern times, a development which in its turn was
conditioned in the 16th and 17th centuries by the new demands of
technology and of manufacturing and was connected in this way with the birth
and development of capitalism.

The Greeks practically exhausted the possibilities of elementary
mathematics, which is the explanation of the fact that the brilliant progress of
geometry dried up at the beginning of our era and was replaced by
trigonometry and algebra in the works of Ptolemy, Diophantus, and
others. In fact, one may consider the works of Diophantus as the beginning
of the period in which algebra played the leading role. But the society
of the ancients, already verging to its decline, was no longer able to
advance science in this new direction.

It should be noted that, a few centuries earlier, arithmetic had already
reached a high level in China. The Chinese scientists of the 2nd and 1st
centuries B.C. described the rules for arithmetical solution of a system of
three equations of the first degree. It is here for the first time in history
that negative coefficients are made use of and the rules for operating
with negative quantities are formulated. But the solutions themselves
were sought only in the form of positive numbers, just as later in the works
of Diophantus. These Chinese books also include a method for the
extraction of square roots and cube roots.

3. The Middle East. With the end of Greek science a period of
scientific stagnation began in Europe, the center of mathematical
development being shifted to India, Central Asia, and the Arabic countries.*
For a period of about a thousand years, from the 5th to the 15th century,
mathematics developed chiefly in connection with the demands of
computation, particularly in astronomy, since the mathematicians of the East
were for the most part also astronomers. It is true that they added nothing
of importance to Greek geometry; in this field they only preserved for
later times the results of the Greeks. But the Indian, Arabic, and Central
Asian mathematicians achieved immense successes in the fields of
arithmetic and algebra.†

* To give some orientation in the dates we list here the times of some of the
outstanding mathematicians of the East. From India: Aryabhata, born about 476 A.D.;
Brahmagupta, about 598-660; Bhaskara, 12th century; from Kharizm: Al-Kharizmi,
9th century; Al-Biruni, 973-1048; from Azerbaijan: Nasireddin Tusi, 1201-1274;
from Samarkand: Gyaseddin Jamschid, 15th century.

† One should keep in mind that it is wrong to associate the development of
mathematics in this period chiefly with the Arabs. The term “Arabic” mathematics came
into use chiefly because most of the scholars of the East wrote in the Arabic language,
which had been spread abroad by the Arab conquests.

As has been mentioned in §2, the Indians invented our present system
of numeration. They also introduced negative numbers, comparing the
contrast between positive and negative numbers with the contrast between
property and debt or between the two directions on a straight line. Finally,
they began to operate with irrational magnitudes exactly as with rational,
without representing them geometrically, in contrast to the Greeks. They
also had special symbols for the algebraic operations, including extraction
of roots. For the very reason that the Indian and Central Asian scholars
were no longer embarrassed by the difference between the irrational and
rational magnitudes, they were able to overcome the “dominance” of
geometry, which was characteristic of Greek mathematics, and to open
up paths for the development of contemporary algebra, free of the heavy
geometric framework into which it had been forced by the Greeks.

The great poet and mathematician, Omar Khayyam (about 1048-1122),
and also the Azerbaijanian mathematician, Nasireddin Tusi (1201-1274),
clearly showed that every ratio of magnitudes, whether commensurable
or incommensurable, may be called a number; in their works we find the
same general definition of number, both rational and irrational, as was
introduced above in Newton’s formulation, in §4. The magnitude of these
achievements becomes particularly clear when we recall that complete
recognition of negative and irrational numbers was attained by European
mathematicians only very slowly, even after the beginning of the
Renaissance of mathematics in Europe. For example, the celebrated French
mathematician Viète (1540-1603), to whom algebra owes a great deal,
avoided negative numbers, and in England protests against them lasted
even into the 18th century. These numbers were considered absurd, since
they were less than zero, that is “less than nothing at all.” Nowadays they
have become familiar, if only in the form of negative temperature; everyone
reads the newspapers and understands what is meant by “the temperature
in Moscow is −8°.”

The word “algebra” comes from the name of a treatise of the
mathematician and astronomer Mahommed ibn Musa al-Kharizmi (Mahommed,
son of Musa, native of Kharizm), who lived in the 9th century. His treatise
on algebra was called Al-jebr al-muqabala, which means “transposition
and removal.” By transposition (al-jebr) is understood the transfer of
negative terms to the other side of an equation, and by removal
(al-muqabala), cancellation of equal terms on both sides.
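
The two rules can be followed on a small equation written in modern
symbols. The equation below is a hypothetical example chosen for
illustration; it is not taken from al-Kharizmi’s treatise, which states
its problems in words.

```latex
\begin{align*}
7x - 4 &= 3x + 8 \\
7x &= 3x + 12 && \text{al-jebr: the negative term } -4 \text{ is transferred to the other side} \\
4x &= 12 && \text{al-muqabala: the equal terms } 3x \text{ are cancelled on both sides} \\
x &= 3
\end{align*}
```

After the two steps only positive terms remain and like terms appear on
one side only, which is the form in which the equation is then solved.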

The Arabic word “al-jebr” became in Latin transcription “algebra”
and the word al-muqabala was discarded, which accounts for the modern
term “algebra.”*

The origin of this term corresponds very well to the actual content of
the science itself. Algebra is basically the doctrine of arithmetical
operations considered formally from a general point of view, with abstraction
from the concrete numbers. Its problems bring to the fore the formal
rules for transformation of expressions and solution of equations.
Al-Kharizmi placed on the title page of his book the actual names of the two
most general formal rules, expressing in this way the true spirit of algebra.

Subsequently, Omar Khayyam defined algebra as the science of solving
equations. This definition retained its significance up to the end of the
19th century, when algebra, along with the theory of equations, struck
out in new directions, essentially changing its character but not changing
its spirit of generality as the science of formal operations.

The mathematicians of Central Asia found methods for calculation,
both exact and approximate, of the roots of a number of equations; they
discovered the general formula for the “binomial of Newton,” although
they expressed it in words; they greatly advanced and systematized the
science of trigonometry, and calculated very accurate tables of sines.
These tables were computed, for astronomical purposes, by the
mathematician Gyaseddin (about 1427) who was working with the famous
Uzbek astronomer Ulug Begh; Gyaseddin also invented decimal fractions
150 years before they were reinvented in Europe.

* It is to be noted also that the mathematical term “algorithm,” denoting a method
or set of rules for computation, comes from the name of the same al-Kharizmi.

To sum up, in the course of the Middle Ages in India and in Central
Asia the present decimal system of numeration (including fractions)
was almost completely built up, as were also elementary algebra and
trigonometry. During the same period the achievements of Chinese science
began to make their way into the neighboring countries; about the 6th
century B.C. the Chinese already had methods for the solution of the
simplest indeterminate equations, for approximate calculations in
geometry, and for the first steps in approximate solution of equations of the
third degree. Essentially the only parts of our present high school course
in algebra that were not known before the 16th century were logarithms
and imaginary numbers. However, there did not yet exist a system of
literal symbols: The content of algebra had outdistanced its form. Yet the
form was indispensable: The abstraction from concrete numbers and the
formulation of general rules demanded a corresponding method of
expression; it was essential to have some way of denoting arbitrary
numbers and operations on them. The algebraic symbolism is the necessary
form corresponding to the content of algebra. Just as in remote antiquity
it had been necessary, in order to operate with whole numbers, that
symbols should be invented for them, so now, to operate with arbitrary
numbers and to give general rules for their use, it was necessary to work
out corresponding symbols. This task, begun at the time of the Greeks,
was not brought to completion until the 17th century, when the present
system of symbols was finally set up in the works of Descartes and
others.

4. Renaissance Europe. At the time of the Renaissance the Europeans
became acquainted with Greek mathematics by way of the Arabic
translations. The books of Euclid, Ptolemy, and Al-Kharizmi were translated
in the 12th century from Arabic into Latin, the common scientific language
of Western Europe, and at the same time, the earlier system of calculation,
as derived from the Greeks and Romans, was gradually replaced by the
present-day Indian method, which was borrowed by the Europeans
from the Arabs.

It was only in the 16th century that European science finally surpassed 
the achievements of its predecessors. Thus the Italians, Tartaglia and 
Ferrari, solved the general cubic equation, and later, the general equation
of the fourth degree (see Chapter IV). Let us note that although these 
results are not taught in school, they belong, with respect to the methods 
employed in them, to elementary algebra. To higher algebra we must 
however refer the general theory of équations. 

During the same period imaginary numbers began for the first time to
be used; at first this was done in a purely formal manner, without logical
foundation, which came considerably later at the beginning of the 19th
century. Our present-day algebraic symbols were also worked out; in
particular, literal symbols were used by Viète in 1591 not only for unknown
quantities but also for given ones.

Many mathematicians took a share in this development of algebra. At
the same time decimal fractions appeared in Europe; they were invented
by the Dutch scholar Stevin, who wrote about them in 1585.

Finally, Napier in Great Britain invented logarithms as an aid in
astronomical calculations and wrote about them in 1614; Briggs calculated
the first decimal tables of logarithms, which were published in 1624.*

At the same time there appeared in Europe the “theory of combinations”
and the general formula for the “binomial of Newton”;† the progressions
were already known, and in this way the structure of elementary algebra
was completed. Therewith came to an end, at the beginning of the 17th
century, the whole period of the mathematics of constant magnitudes,
of elementary mathematics as it is now taught, with a few additions, in
our schools. Arithmetic, elementary geometry, trigonometry, and
elementary algebra were now essentially complete. There followed a transition
to higher mathematics, to the mathematics of variable magnitudes.

It is not to be thought, however, that the development of elementary
mathematics ceased at this time; for example, new results were discovered
and are being constantly discovered today in elementary geometry.
Furthermore, it is precisely because of the subsequent development of
higher mathematics that we now understand more clearly the essential
nature of elementary mathematics itself. But the leading role in
mathematics was now taken over by the concepts of variable magnitude,
function, and limit. The problems that led from elementary mathematics
to higher mathematics are nowadays clarified and solved by the concepts
and methods of higher mathematics (occasionally they are not solvable
at all by elementary methods), and there are other problems which may
be stated in terms of elementary mathematics but which serve even
today as a source of more general results and even of entire theories.
Examples are provided by the earlier mentioned theory of regular systems
of figures or by problems of the theory of numbers which are elementary
in their formulation but far from elementary in the methods by
which they are solved. For further details the reader may consult
Chapter X.


* It is interesting to note that Napier did not define logarithms as they are defined
nowadays, when we say that in the formula $x = a^y$ the number y is the logarithm of x
to the base a. This definition of logarithms appeared later. Napier’s definition was
related to the concepts of a variable magnitude and an infinitesimal and amounted
to saying that the logarithm of x is a function y = f(x) whose rate of growth is inversely
proportional to x; that is, $y' = c/x$ (see Chapter II). In this way the basis of the definition
was essentially a differential equation, defining the logarithm, although differentials
had not yet been invented.

† The formula bears the name of Newton not because he was the first to discover it
but because he generalized it from integral exponents to arbitrary fractional and
irrational exponents.


§6. Mathematics of Variable Magnitudes

1. Variable and function. In the 16th century the investigation of
motion was the central problem of physics. The physical sciences were
led to this problem, and to the study of various other problems involving
the interdependence of variable magnitudes, by the demands of practical
life and by the whole development of science itself.

As a reflection of the general properties of change, there arose in 
mathematics the concepts of a variable magnitude and a function, and it 
was this cardinal extension of the subject matter of mathematics that 
determined the transition to a new stage, to the mathematics of variable 
magnitudes. 

The law of motion of a body in a given trajectory, for example along a 
straight line, is defined by the manner in which the distance covered by 
the body increases with time. 

Thus Galileo (1564-1642) discovered the law of falling bodies by
establishing that the distance fallen increases proportionally to the square
of the time. This fact is expressed in the well-known formula

$$s = \frac{gt^2}{2}, \qquad (1)$$

where g is approximately equal to 9.81 m/sec².

In general, the law of motion expresses the distance covered in the time
t. Here the time t and the distance s are respectively the “independent”
and the “dependent” variable, and the fact that to each time t there
corresponds a definite distance s is what is meant by saying that the distance s
is a function of the time t.

The mathematical concepts of variable and function are the abstract
generalization of concrete variables (such as time, distance, velocity,
angle of rotation, and area of surface traced out) and of the
interdependences among them (the distance depends on the time and so forth). Just as
the concept of a real number is the abstract image of the actual value of
an arbitrary magnitude, so a “variable” is the abstract image of a varying
magnitude, which assumes various values during the process under
consideration. A mathematical variable x is “something” or, more
accurately, “anything” that may take on various numerical values. This
is the meaning of a variable in general; in particular, we may understand
by it the time, the distance, or any other variable magnitude.

In exactly the same way, a function is the abstract image of the
dependence of one magnitude on another. The assertion that y is a function of x
means in mathematics only that to each possible value of x there
corresponds a definite value of y. This correspondence between the values
of y and the values of x is called a function. For example, according to
the law of falling bodies, the distance covered corresponds to the time of
fall by formula (1). The distance is a function of the time. Let us look at
some other examples.

The energy of a falling body is expressed by its mass and its velocity
according to the formula

$$E = \frac{mv^2}{2}. \qquad (2)$$

For a given body the energy is a function of the velocity v.

By a familiar law the quantity of heat generated in a conductor in unit
time by the passage of an electric current is expressed by the formula

$$Q = \frac{RI^2}{2}, \qquad (3)$$

where I is the magnitude of the current and R is the resistance of the
conductor. For a given resistance there corresponds to every current I
a definite amount of heat Q, generated in unit time. That is, Q is a
function of I.

The area S of a right-angled triangle with a given acute angle α and
corresponding side x (see figure 5) is expressed by the formula

$$S = \frac{1}{2} x^2 \tan\alpha. \qquad (4)$$

For a given angle α the area is a function of the side x.

All these formulas (1)-(4) may be united in the one

$$y = \frac{1}{2} a x^2. \qquad (5)$$


This general formula represents a transition from the concrete variable
magnitudes t, s, E, Q, v and so forth to the general variables x and y, and
from the concrete dependences (1), (2), (3), (4) to their general form (5).
Mechanics and the theory of electricity have to do with concrete formulas
(1), (2), (3), interrelating concrete magnitudes, but the mathematical
theory of functions deals with the general formula (5), without associating
this formula with any concrete magnitudes.
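
The passage from the concrete formulas to the general one can be
sketched in code: a single function for y = ax²/2, with each concrete law
obtained by choosing what a and x stand for. This is only an
illustration; all function names here are hypothetical, and the heat
formula keeps the factor 1/2 exactly as it appears in formula (3) of the
text.

```python
import math


def half_a_x_squared(a, x):
    """General formula (5): y = a * x**2 / 2, with no concrete meaning attached."""
    return 0.5 * a * x * x


def fall_distance(t, g=9.81):
    """Formula (1): distance s fallen in time t (a = g, x = t)."""
    return half_a_x_squared(g, t)


def kinetic_energy(m, v):
    """Formula (2): energy E of a body of mass m at velocity v (a = m, x = v)."""
    return half_a_x_squared(m, v)


def heat_per_unit_time(R, I):
    """Formula (3) as given in the text: Q for resistance R and current I."""
    return half_a_x_squared(R, I)


def triangle_area(alpha, x):
    """Formula (4): area S for acute angle alpha in radians (a = tan(alpha))."""
    return half_a_x_squared(math.tan(alpha), x)


print(fall_distance(2.0))        # distance in meters after 2 seconds of fall
print(kinetic_energy(2.0, 3.0))  # 9.0
```

The four concrete functions differ only in the names and meanings of
their arguments; the computation itself is the same in every case, which
is the sense in which formula (5) is abstracted from formulas (1)-(4).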

The next degree of abstraction from the concrete consists in our
examining not a given dependence of y on x, like $y = \frac{1}{2}ax^2$,
$y = \sin x$, $y = \log x$ and so forth, but the general dependence of y
on x expressed in the abstract formula

$$y = f(x).$$

This formula states that the magnitude y is in general some function of x;
that is, to each value assumed by x there corresponds, in some fashion
or another, a definite value y. The subject matter of mathematics thus
consists not only of certain given functions ($y = \frac{1}{2}ax^2$,
$y = \sin x$, and so forth), but of arbitrary (more accurately, more or
less arbitrary) functions.
These degrees of abstraction, first from concrete magnitudes and then
from concrete functions, are analogous to the degrees of abstraction
observed in the formation of the concept of a whole number: First,
abstraction from concrete collections of objects led to the concept of whole
numbers (1, 3, 12, and so forth), and then a further abstraction led to the
concept of an arbitrary whole number in general. This generalization is
the result of a profound interaction between analysis and synthesis:
analysis of separate interrelations and synthesis, in the form of new
concepts, of their common features.

The branch of mathematics devoted to the study of functions is called
analysis, or often, infinitesimal analysis, since one of the most important
elements in the study of functions is the concept of the infinitesimal (the
meaning of this concept and its significance are explained in Chapter II).

Since a function is the abstract image of a dependence of one magnitude
on another, we may say that analysis takes as its subject matter
dependences between variable magnitudes, not between one concrete magnitude
and another but between variables in general, in abstraction from their
content. An abstraction of this sort guarantees great breadth of
application, since one formula or one theorem contains an infinite number of
possible concrete cases. An example of this is given already by our simple
formulas (1)-(5). So the complete analogy of analysis with arithmetic and
algebra becomes evident. They all originate in definite practical problems
and give a general abstract expression to concrete relationships in the
actual world.

2. Analytic geometry and analysis. Thus the new period of mathematics,
beginning in the 17th century, may be defined as the period of the
birth and development of analysis. (This is the third of the three important
periods mentioned earlier.) It is to be understood, of course, that no
theory arises as a result of the mere formation of new concepts, that
analysis could not result from the mere existence of the concepts of variable
and function. For the founding of a theory, and all the more of a complete
branch of science like mathematical analysis, it is necessary that the new
concepts become active, so to speak, that among them there be discovered
new relationships, and that they permit the solution of new problems.

But more than that, new concepts can originate and develop, and
become more general and precise, only on the basis of the very problems
they enable us to solve, only through those theorems of which they form
a part. The concepts of variable and function did not arise in complete
form in the mind of Galileo, Descartes, Newton, or anybody else. They
occurred to many mathematicians (for example Napier in connection with
logarithms) and gradually assumed a more or less clear, but still by no
means final, form with Newton and Leibnitz, being made still more
precise and general in the subsequent development of analysis. Their
present-day definition was laid down only in the 19th century, but even
it is not absolutely rigorous or altogether final. The development of the
concept of a function is continuing even at the present time.

Mathematical analysis was based on material furnished by the new 
science of mechanics, and on problems of geometry and algebra. The first 
definite step toward the mathematics of variable magnitudes was the 
appearance in 1637 of the “geometry” of Descartes, where the foundations 
were laid for the so-called analytic geometry. The basic ideas of Descartes 
are as follows. 

Suppose we are given, for example, the equation

$x^2 + y^2 = a^2$. (6)

In algebra x and y were understood as unknowns, and since the given
equation does not allow us to determine them, it did not present any
essential interest for algebra. But Descartes did not consider x and y as
unknowns, to be found from the equation, but as variables; so that the
given equation expresses the interdependence of two variables. Such an
equation may be written in general form, by taking all its terms to the
left-hand side, thus:

$F(x, y) = 0$.

Further, Descartes introduced into the plane the coordinates x, y which
are now called Cartesian (figure 6). In this way, to each pair of values
x and y there corresponds a point, and conversely to each point there
corresponds a pair of coordinates x, y. Consequently, the equation
$F(x, y) = 0$ determines the geometric locus of those points on
the plane whose coordinates satisfy the equation. In general, this will be
a curve. For example, equation (6) determines the circumference of a
circle of radius a with center at the origin. In fact, as is obvious from
figure 7, by the theorem of Pythagoras, $x^2 + y^2$ is the square of the
distance from the origin O to the point M with coordinates x and y.
So equation (6) represents the geometric locus of those points whose
distance from the origin is equal to a, which is the circumference of a
circle.

Fig. 6.

Conversely, a geometric locus of points, given by a geometric condition,
may also be given by an equation expressing the same condition in the
language of algebra by means of coordinates. For example, the geometric
condition defining the circumference of a circle, namely that it is a
geometric locus of points equidistant from a given point, may be expressed
in algebraic language by equation (6).
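
The correspondence between equation (6) and the circle can be checked
numerically. What follows is only a minimal sketch in Python; the names
F and on_locus, the radius a = 5, and the tolerance are illustrative
assumptions and do not come from the text.

```python
import math

def F(x, y, a):
    """Left-hand side of equation (6) rewritten in the form F(x, y) = 0."""
    return x**2 + y**2 - a**2

def on_locus(x, y, a, tol=1e-9):
    """True when the point (x, y) satisfies equation (6) within tol."""
    return abs(F(x, y, a)) <= tol

a = 5.0  # assumed radius, chosen only for illustration

# Points at distance a from the origin satisfy the equation ...
circle_points = [(a * math.cos(t), a * math.sin(t))
                 for t in (0.0, 0.7, 2.0, 4.5)]
print(all(on_locus(x, y, a) for x, y in circle_points))

# ... and a point at a different distance from the origin does not.
print(on_locus(1.0, 1.0, a))
```

The first check goes from the geometric condition (distance a from the
origin) to the equation, the second from the equation back to the
geometry, which is the two-way passage described above.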

Thus the general problem and the general method of analytic geometry
are as follows: We represent a given equation in two variables by a curve
on the plane, and from the algebraic properties of the equation we
investigate the geometric properties of the corresponding curve; and
conversely, from the geometric properties of the curve we find the equation,
and then from the algebraic properties of the equation we investigate the
geometric properties of the curve. In this way geometric problems may be
reduced to algebraic, and so finally to computation.

The content of analytic geometry will be discussed in detail in Chapter
III. We now wish to direct attention to the fact that, as is evident from
our short explanation, it originated in a union of geometry, algebra, and
the general idea of a variable magnitude. The main geometric content of
the early beginnings of analytic geometry was the theory of conic sections,
ellipse, hyperbola, and parabola. This theory, as we have pointed out,
was developed by the ancient Greeks; the results of Apollonius already
contained in geometric form the equations of the conic sections. The union
of this geometric content with algebraic form, developed after the time of
the Greeks, and with the general idea of a variable magnitude, arising from
the study of motion, produced analytic geometry.

Among the Greeks the conic sections were a subject of purely
mathematical interest, but by the time of Descartes they were of practical
importance for astronomy, mechanics, and technology. Kepler (1571-1630)
discovered that the planets move around the sun in ellipses, and Galileo
established the fact that a body thrown in the air, whether it is a stone
or a cannonball, moves along a parabola (to the first approximation, if
we may neglect air resistance). As a result, the calculation of various
magnitudes referring to the conic sections became an urgent necessity,
and it was the method of Descartes that solved this problem. So the
way was prepared for his method by the preceding development of
mathematics, and the method itself was brought into existence by the
insistent demands of science and technology.

3. Differential and integral calculus. The next decisive step in the
mathematics of variable magnitudes was taken by Newton and Leibnitz during
the second half of the 17th century, in the founding of the differential
and integral calculus. This was the actual beginning of analysis, since the
subject matter of this calculus is the properties of functions themselves,
as distinct from the subject matter of analytic geometry, which is geometric
figures. In fact Newton and Leibnitz only brought to completion an
immense amount of preparatory work, shared by many mathematicians and
going back to the methods for determining areas and volumes worked
out by the ancient Greeks.

Here we shall not explain the fundamental concepts of differential and
integral calculus and of the theories of analysis that followed them, since
this will be done in the special chapters devoted to these theories. We wish
only to draw attention to the sources of the calculus, which were mainly
the new problems of mechanics and the old problems of geometry, the
latter consisting of drawing a tangent to a given curve and of determining
areas and volumes. These geometric problems had already been studied
by the ancients (it is sufficient to mention Archimedes), and also by
Kepler, Cavalieri, and others at the beginning of the 17th century.
But the decisive event was the discovery of the remarkable relation
between these two types of problems and the formulation of a general
method for solving them; this was the achievement of Newton and
Leibnitz.

This relation, allowing us to connect the problems of mechanics with
those of geometry, was discovered because of the possibility, arising from
the method of coordinates, of making a graphical representation of the
dependence of one variable on another, or in other words of a function.
With the help of this graphical representation, it is easy for us to formulate
the earlier mentioned relation between the problems of mechanics and
geometry, which was the source of the differential and integral calculus,
and consequently to describe the general content of these two types of
calculus.

The differential calculus is basically a method for finding the velocity
of motion when we know the distance covered at any given time. This
problem is solved by “differentiation.” It turns out that the problem is
completely equivalent to that of drawing a tangent to the curve
representing the dependence of distance on time. The velocity at the moment t
is equal to the slope of the tangent to the curve at the point corresponding
to t (figure 8).
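
The statement that velocity is the slope of the tangent can be tried out
numerically. The sketch below is an illustration only: it assumes the
law of free fall s = g t^2 / 2 with g = 9.8 as the distance function,
and it approximates the slope by a central difference with a small step
h; none of these choices is prescribed by the text.

```python
def distance(t, g=9.8):
    """Distance covered after time t, assuming the law s = g * t**2 / 2."""
    return 0.5 * g * t**2

def velocity(s, t, h=1e-6):
    """Approximate the slope of the tangent to the curve s at the moment t.

    Uses the slope of the chord through the points at t - h and t + h,
    which approaches the slope of the tangent as h becomes small.
    """
    return (s(t + h) - s(t - h)) / (2 * h)

t = 2.0
print(velocity(distance, t))  # close to g * t = 19.6
```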

The integral calculus is basically a method of finding the distance
covered when the velocity is known, or more generally of finding the total
result of the action of a variable magnitude. This problem is obviously
the converse of the problem of the differential calculus (the problem of
finding the velocity); it is solved by “integration.” It turns out that the
problem of integration is completely equivalent to that of finding the
area under the curve representing the dependence of the velocity on time.
The distance covered in the interval of time from the moment $t_1$ to the
moment $t_2$ is equal to the area under the curve between the straight lines
corresponding on the graph to the values $t_1$ and $t_2$ (figure 9).
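
The converse problem can be illustrated in the same spirit. The sketch
below assumes, purely for illustration, the velocity law v = g t with
g = 9.8, and approximates the area under the velocity curve between two
moments by summing thin trapezoids; the function names and the number of
strips are hypothetical choices.

```python
def velocity(t, g=9.8):
    """Velocity at time t, assuming the law v = g * t."""
    return g * t

def distance_covered(v, t1, t2, n=1000):
    """Approximate the area under the curve v between t1 and t2.

    The interval is cut into n strips of width h and each strip is
    replaced by a trapezoid.
    """
    h = (t2 - t1) / n
    total = 0.5 * (v(t1) + v(t2))
    for k in range(1, n):
        total += v(t1 + k * h)
    return total * h

# Exact value for this law: g * (3**2 - 1**2) / 2 = 39.2
print(distance_covered(velocity, 1.0, 3.0))
```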

By abstracting from the mechanical formulation of the problems of
the calculus and by dealing with functions rather than with dependence
of distance or velocity on time, we obtain the general concept of the
problems of differential and integral calculus in abstract form.

Fundamental to the calculus, as to the whole subsequent development
of analysis, is the concept of a limit, which was formulated somewhat
later than the other fundamental concepts of variable and function. In the
early days of analysis the role later played by the limit was taken by the
somewhat nebulous concept of an infinitesimal. The methods for actual
calculation of velocity, given the distance covered (namely, differentiation),
and of distance, given the velocity (integration), were founded on a union
of algebra with the concept of limit. Analysis originated in the application
of these concepts and methods to the aforementioned problems of
mechanics and geometry (and also to certain other problems; for example,
problems of maxima and minima). The science of analysis was in turn
absolutely necessary for the development of mechanics, in the formulation
of whose laws its concepts had already appeared in latent form. For
example, the second law of Newton, as formulated by Newton himself,
states that “the change in momentum is proportional to the acting force”
(more precisely: The rate of change of momentum is proportional to the
force). Consequently, if we wish to make any use of this law, we must
be able to define the rate of change of a variable, that is, to differentiate.
(If we state the law in the form that the acceleration is proportional to the
force, the problem remains the same, because acceleration is proportional
to rate of change of momentum.) Also, it is perfectly clear that in order to
state the law governing a motion when the force is variable (in other words,
the motion proceeds with a variable acceleration), we must be able to
solve the inverse problem of finding a magnitude given its rate of change;
in other words, we must be able to integrate. So one might say that Newton
was simply compelled to invent differentiation and integration in order
to develop the science of mechanics.
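
As a rough illustration of why a variable force calls for integration,
the sketch below recovers velocity and position from a given force by
stepwise summation. Everything specific in it is an assumption made for
the example: the force law F(t) = 6t, the mass 2, the start from rest at
the origin, and the trapezoidal stepping scheme.

```python
def integrate_motion(force, mass, t_end, n=10000):
    """Integrate mass * dv/dt = force(t), starting at rest at the origin.

    Returns (velocity, position) at time t_end. Each step finds the
    velocity from its rate of change (the acceleration) and then the
    position from its rate of change (the velocity).
    """
    h = t_end / n
    v, x = 0.0, 0.0
    for k in range(n):
        t0, t1 = k * h, (k + 1) * h
        a0, a1 = force(t0) / mass, force(t1) / mass
        v_new = v + 0.5 * (a0 + a1) * h   # velocity from acceleration
        x += 0.5 * (v + v_new) * h        # position from velocity
        v = v_new
    return v, x

# With F(t) = 6t and mass 2 the exact answers are v = 1.5 t^2, x = 0.5 t^3.
v, x = integrate_motion(lambda t: 6.0 * t, mass=2.0, t_end=2.0)
print(v, x)  # close to 6.0 and 4.0
```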

4. Other branches of analysis. Along with the differential and integral
calculus, other branches of analysis arose: the theory of series (see Chapter
II, §14), the theory of differential equations (Chapters V and VI), and the
application of analysis to geometry, which later became a special branch
of geometry, called differential geometry and dealing with the general
theory of curves and surfaces (Chapter VII). All these theories were
brought to life by the problems of mechanics, physics, and technology.

The theory of differential equations, the most important branch of
analysis, has to do with equations in which the unknown is no longer a
magnitude but a function, or in other words a law governing the
dependence of one magnitude on another or on several others. It is easy to
understand how such equations arose. In mechanics we seek to determine the
whole law of motion of a body under given conditions and not just one
value of the velocity or of the distance covered. In the mechanics of fluids
it is necessary to find the distribution of velocity over the whole mass of
fluid in motion, or in other words to find the dependence of the velocity
on all three space coordinates and on time. Analogously, in the theory
of electricity and magnetism we must find the tension in the field
throughout all space; that is, the dependence of this tension on the same
three space coordinates, and similarly in other cases.

Problems of this sort arose continually in the various branches of
mechanics, including hydrodynamics and the theory of elasticity, in
acoustics, in the theory of electricity and magnetism, and in the theory of
heat. From the very moment of its birth, analysis remained in close contact
with mechanics and with physics in general, its most important
achievements being invariably connected with the solution of problems posed
by the exact sciences. Beginning with Newton, the greatest analysts,
D. Bernoulli (1700-1782), L. Euler (1707-1783), J. Lagrange (1736-1813),
H. Poincaré (1854-1912), M. V. Ostrogradskiĭ (1801-1861) and A. M.
Lyapunov (1857-1918), as well as many others who laid new foundations
in analysis, started as a rule from the urgent problems of contemporary
physics.

In this way new theories arose: In direct connection with mechanics,
Euler and Lagrange founded a new branch of analysis, called the calculus
of variations (see Chapter VIII), and at the end of the 19th century
Poincaré and Lyapunov, starting again from the problems of mechanics,
founded the so-called qualitative theory of differential equations (see
Chapter V, §7).

In the 19th century analysis was enriched by an important new branch,
the theory of functions of a complex variable (see Chapter IX). The
rudiments of it are to be found in the works of Euler and certain other
mathematicians, but its transformation into a well-formed theory took place
in the middle of the 19th century and was carried out to a great extent by
the French mathematician Cauchy (1789-1857). This theory rapidly
underwent an imposing development with numerous significant results
that allowed mathematicians to penetrate more deeply into many of the
laws of analysis and found important applications in problems of
mathematics itself, and of physics and technology.

Analysis developed rapidly; not only did it form the center and the most
important part of mathematics but it also penetrated into the older
regions: algebra, geometry, and even the theory of numbers. Algebra
began to be thought of as basically the doctrine of functions expressed
in the form of polynomials of one or several variables.* Analytic and
differential geometry began to dominate the field of geometry. As far
back as Euler, methods of analysis were introduced into the theory of
numbers and formed in this way the beginning of the so-called analytic
theory of numbers, which contains some of the most profound
achievements of the science of whole numbers.

* Polynomials are functions of the form
$y = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$. The fundamental problem of the
algebra of the period, namely the solution of the equation
$a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$, simply means the search for
values of x for which the function
$y = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$ is equal to zero. The very
existence of a solution, of a root of the equation, which is called the
fundamental theorem of algebra, is proved by means of analysis (see
Chapter IV, §3).

Through the influence of analysis, with its concepts of variable, function,
and limit, the whole of mathematics was penetrated by the idea of motion
and change, and therefore of dialectic. In exactly the same way, basically
through analysis, mathematics was affected by the exact sciences and
technology and in turn played a role in their development, since it was the
means of giving exact expression to their laws and of solving their
problems. Just as among the Greeks mathematics was basically geometry,
one may say that after Newton it was basically analysis. Of course, analysis
did not completely absorb the whole of mathematics; in geometry, in the
theory of numbers, and in algebra the problems and methods
characteristic of these sciences were everywhere continued. Thus in the 17th
century there arose, along with analytic geometry, another branch of
geometry, namely projective geometry, in which purely geometric methods
played a dominant role. It originated chiefly in problems of the
representation of objects on a plane (projection), and as a result it is
particularly useful in descriptive geometry.

At the same time there was developed an important new branch of
mathematics, the theory of probability, which takes as its subject matter
the uniformities observable in large masses of phenomena, such as a long
series of rifle shots or tosses of a coin. In the succeeding period it acquired
a special importance in physics and technology and its development
was conditioned by the problems which came to it from those branches of
science. The characteristic feature of this theory is that it deals with the laws
of “random events,” providing mathematical methods for investigation
of the irregularities that necessarily appear in random events. The
basic features of the theory of probability will be explained in Chapter XI.

5. Applications of analysis. Analysis in all its branches provided
physics and technology with powerful methods for the solution of problems
of many different kinds. We have already mentioned the earliest of these:
to find the rate of change of a magnitude when we know how the magnitude
itself depends on time; to find the area of curvilinear figures and the
volumes of solids; and to find the total result of some process or another
or the total action of a variable magnitude. Thus, the integral calculus
allows us to determine the work done by an expanding gas as the pressure
changes according to a well-known law; the same integral calculus allows
us to compute, for example, the tension of an electric field with an
arbitrarily given system of charges, basing our work on the law of Coulomb
which determines the tension of a field resulting from a point charge, and
so forth.

Further, analysis provided a method for finding the maximum and the
minimum values of a magnitude under given conditions. Thus, with the
help of analysis it is easy to determine the shape of a cylindrical cistern
which for a given volume will have the smallest surface and consequently
will require the smallest outlay of material. It turns out that the cistern
will have this property if its height is equal to the diameter of its base
(figure 10). Analysis allows us to determine the shape of the curve along
which a body must roll in order to fall in the shortest time from one given
point to another (this curve is the so-called cycloid; figure 11).

For the solution of these and other problems the reader may turn to
Chapters II and VIII.
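
The result about the cistern can be checked by direct computation. The
sketch below assumes a closed cylinder (side surface plus both ends),
which is the case in which height equals diameter, and searches
numerically for the radius giving the smallest surface at a fixed
volume. The function names, the unit volume, and the search interval are
illustrative assumptions; the search is a plain numerical substitute for
the methods of analysis referred to above.

```python
import math

def surface(r, volume):
    """Total surface (side plus two ends) of a cylinder of given volume."""
    h = volume / (math.pi * r**2)
    return 2 * math.pi * r**2 + 2 * math.pi * r * h

def best_radius(volume, lo=1e-3, hi=10.0, iters=200):
    """Ternary search for the radius that minimizes the surface.

    For fixed volume the surface equals 2*pi*r**2 + 2*volume/r, which has
    a single minimum for r > 0, so repeatedly shrinking the interval
    toward the smaller of two interior values is valid.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if surface(m1, volume) < surface(m2, volume):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

V = 1.0  # assumed volume, in arbitrary units
r = best_radius(V)
h = V / (math.pi * r**2)
print(h / (2 * r))  # close to 1: the height equals the diameter
```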

Analysis, or more precisely the theory of differential equations, allows
us not merely to find separate values for variable magnitudes but also
to determine unknown functions; that is, to find laws of dependence of
certain variables on others. Thus we have the possibility, on the basis of
the general laws of electricity, of computing how the current varies with
time in a circuit with arbitrary resistance, capacitance, and self-induction.
We can determine laws for the distribution of velocities throughout the
whole mass of a fluid under given conditions. We can deduce general
laws for the vibration of strings and membranes, and for the propagation
of vibrations in various media; here we are referring to sound waves,
electromagnetic waves, or elastic vibrations propagated through the Earth
by earthquakes or explosions. Parenthetically, we may remark that new
methods are thereby provided for searching for useful minerals and for
carrying out investigations far below the surface of the Earth. Individual
problems of this sort will be found in Chapters V and VI.

Finally, analysis not only provides us with methods for solving special
problems; it also gives us general methods for mathematical formulation
of the quantitative laws of the exact sciences. As was mentioned earlier,
the general laws of mechanics could not be formulated mathematically
without recourse to the concepts of analysis, and without such a
formulation we would not be able to solve the problems of mechanics. In
exactly the same way the general laws for heat conduction, diffusion through
porous materials, propagation of vibrations, the course of chemical
reactions, the basic laws of electromagnetism, and many other laws
simply could not be given a mathematical formulation without the concepts
of analysis. It is only as the result of such a formulation that these laws
can be applied to the most varied concrete cases, providing a basis for
exact mathematical conclusions in the special problems of heat conduction,
vibrations, chemical solution, electromagnetic fields, and other problems
of mechanics, astronomy, and all the numerous branches of physics,
chemistry, heat engineering, power, machine construction, electrical
engineering, and so forth.

6. Critical examination of the foundations of analysis. Just as in the
history of geometry among the Greeks the rigorous and systematic
presentation given by Euclid brought to completion a long previous
development, so in the development of analysis there arose the necessity of
placing it upon a firmer basis than had been provided by the first creators
of its powerful methods: Newton, Euler, Lagrange, and others. As the analysis
founded by them grew more extensive, it began, on the one hand, to deal
with more profound and difficult problems, and on the other, to require
from its very extent a more systematic and carefully reasoned basis. The
growth of the theory necessitated a systematization and critical analysis
of its foundations. To put a theory on a firm foundation requires
examination of its entire development and should by no means be considered
as a starting point for the theory itself, since without the theory we would
simply have no idea of what it is that we need to provide with a foundation.
By the way, certain contemporary formalists forget this fact when they
consider it advisable to found and develop a theory starting from axioms
that have not been selected on the basis of any analysis of the actual
material which they are supposed to summarize. But the axioms themselves
require a justification of their content; they only sum up other material and
provide a foundation for the logical construction of a theory.*

* This double role of the axioms is sometimes lost from view even in works
of a methodological character, which thereby attribute to the construction
of axioms a significance which does not at all belong to it, namely that of
the total construction of a theory.

The necessary period of criticism, systematization, and laying of
foundations occurred in analysis at the beginning of the last century.
Through the efforts of a number of eminent scientists this important and
difficult work was brought to a successful completion. In particular,
precise definitions were given for the basic concepts of real number,
variable, function, limit, and continuity.

However, as we have already had occasion to mention, none of these
definitions may be considered as absolutely rigorous or final. The
development of these concepts is continuing. Euclid and all the
mathematicians in the course of 2,000 years after him no doubt considered
his “Elements” as the practical limit of logical rigor. But to a contemporary
view the Euclidean foundations of geometry seem quite superficial. This
historical example shows that we ought not to flatter ourselves with any idea
of “absolute” or “final” rigor in contemporary mathematics. In a science
that is not yet dead and mummified, there is not and cannot be anything
perfect. But we can say with confidence, first, that the foundations of
analysis as they exist at present correspond in a quite satisfactory way to
the contemporary problems of science and the contemporary conception of
logical precision; and second, that the continued deepening of these concepts
and the discussions that are now taking place about them give us no cause,
and will not give us cause, simply to reject them; these discussions will
lead us to a new, more precise, and more profound understanding, the
results of which it is still difficult to estimate.

Although the establishment of the basic principles of a theory forms a
summary of its development, it does not represent the end of the theory;
on the contrary, it is conducive to further development. This is exactly
what happened in analysis. In connection with the deepening of its
foundations there arose a new mathematical theory, created by the German
mathematician Cantor in the seventies of the last century, namely the
general theory of infinite sets of arbitrary abstract objects, whether
numbers, points, functions or any other “elements”. On the basis of these
ideas there grew up a new chapter in analysis, the so-called theory of
functions of a real variable, whose concepts, along with those of the
foundations of analysis and the theory of sets, are explained in Chapter XV.
At the same time the general ideas of the theory of sets penetrated every
branch of mathematics. But this “set-theoretical point of view” is
inseparably connected with a new stage in the development of mathematics,
which we will now consider briefly.


§7. Contemporary Mathematics 

1. The more advanced character of present-day mathematics. To the
four stages of the development of mathematics mentioned in §5 there
naturally correspond stages in our mathematical education, the material
learned at each stage of our study consisting, to a fair degree of
approximation, of the basic content of the corresponding period in the
history of mathematics.

The basic results of arithmetic and geometry, obtained in the first period
of the development of mathematics, form the subject of primary education
and are known to us all. For example, when we determine the quantity
of material necessary to carry out a certain job, let us say to cover a floor,
we are already making use of these first results of mathematics. The most
important achievements of the second period, the period of elementary
mathematics, are taught in the high schools. The basic results of the third
period, the foundations of analysis, the theory of differential equations,
higher algebra, and so forth, form the mathematical instruction of an
engineer; they are studied in all the schools of higher education, except
those devoted purely to the humanities. In this way the basic ideas and
results of the mathematics of that period are widely known, use being
made of them to some extent by almost every engineer and scientist.

On the other hand, the ideas and results of the present-day period
of mathematics are studied almost exclusively in graduate departments
of mathematics and physics. Besides mathematical specialists, they are used
by researchers in the fields of mechanics and physics, and in a number of
the newer branches of technology. Of course, this does not at all mean
that they have no practical application, but since they represent the most
recent results of science, they are naturally more complicated.
Consequently, as we now pass to a general description of the latest stage in
the development of mathematics we can no longer consider that everything
which we mention briefly will be altogether clear. We will try to present
in a few lines the most general character of the new branches of
mathematics; their content will be explained in greater detail in the
corresponding chapters of the book.

If the present section seems overly difficult it may be passed over at
first reading and taken up again after study of the special chapters.

2. Geometry. The beginning of the present-day development of
mathematics is characterized by profound changes in all its basic fields:
algebra, geometry, and analysis. This change may perhaps be followed
most clearly in the field of geometry. In the year 1826 Lobačevskiĭ, and
almost simultaneously with him the Hungarian mathematician Janos
Bolyai, developed the new non-Euclidean geometry. The ideas of
Lobačevskiĭ were far from being immediately understood by all
mathematicians. They were too bold and unexpected. But from this moment
there began a fundamental new development of geometry; the very
conception of what is meant by geometry was changed. Its subject matter
and the range of its applications were rapidly extended. The most
important step, after Lobačevskiĭ, in this direction was taken in 1854 by
the celebrated German mathematician, Riemann. He clearly formulated
the general idea that an unlimited number of “spaces” could be
investigated by geometry, and at the same time he indicated their possible
significance in the real world. In the new development of geometry two
features were characteristic.

In the first place the earlier geometry studied only the spatial forms and
relations of the material world, and then only to the extent in which they
appear in the framework of Euclidean geometry, but now the subject
matter of geometry began to include also many other forms and relations
of the actual world, provided only they were similar to the spatial
ones and therefore allowed the use of geometric methods. The term
“space” thereby took on in mathematics a new meaning, broader and at
the same time more special. Simultaneously, the methods of geometry
became much richer and more varied. In their turn they provide us with
more complete means for learning about the physical world around us,
the world from which geometry in its original form was abstracted.

In the second place, even in Euclidean geometry important progress
was made: the properties of incomparably more complicated figures were
studied, even including arbitrary sets of points. Also a fundamentally
new attitude appeared toward the properties of the figures under
investigation. Separate groups of properties were distinguished, which
could be investigated in abstraction from others, and this very abstraction
within geometry gave rise to many characteristic branches of the subject,
which essentially became independent “geometries.” The development of
geometry in all these directions is being continued and more and more
new “spaces” and their “geometries” are being studied: the space of
Lobačevskiĭ, projective space, Euclidean and other spaces of various
dimensions, in particular four-dimensional space, Riemann spaces, Finsler
spaces, topological spaces, and so forth. These theories find important
application in mathematics itself, outside of geometry, and also in physics
and mechanics; particularly noteworthy are their applications in the theory
of relativity of contemporary physics, which is a theory of space, time,
and gravitation. From what has been said it is clear that we are dealing
here with a qualitative change in geometry.

The ideas of contemporary geometry and some of the elements of the
theory of various spaces investigated in it will be explained in Chapters
XVII and XVIII.

3. Algebra. Algebra too underwent a qualitative change. In the first
half of the 19th century new theories arose, which led to changes in its
character, and to an extension of its subject matter and its range of
application.

In its original form, as pointed out in §5, algebra dealt with mathematical
operations on numbers considered from a formal point of view, in
abstraction from given concrete numbers. This abstraction found expression
in the fact that in algebra magnitudes are denoted by letters, on which
calculations are carried out according to well-known formal rules.

Contemporary algebra retains this basis but widens it in a very extensive
way. It now considers “magnitudes” of a much more general nature
than numbers, and studies operations on these “magnitudes” which are
to some extent analogous in their formal properties to the ordinary
operations of arithmetic: addition, subtraction, multiplication, and division.
A very simple example is offered by vector magnitudes, which may be
added by the well-known parallelogram rule. But the generalization carried
out in contemporary algebra is such that even the very term “magnitude”
often loses its meaning and one speaks more generally of “elements” on
which it is possible to perform operations similar to the usual algebraic
ones. For example, two motions carried out one after the other are
evidently equivalent to a certain single motion, which is the sum of the
two; two algebraic transformations of a formula may be equivalent to
a single transformation that produces the same result, and so forth; and
so it is possible to speak of a characteristic “addition” of motions or
transformations. All this and much else is studied in a general abstract
form in contemporary algebra.
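The “addition” of motions just described can be tried out numerically. The following is only a minimal sketch, under the assumption (ours, not the text's) that a motion of the plane is stored as a rotation angle followed by a shift; the names `apply` and `compose` are illustrative. It checks that carrying out two motions one after the other gives the same result as one suitably chosen single motion.

```python
import math

# A plane motion is modeled here (an assumption for illustration) as a triple
# (angle, tx, ty): rotate about the origin by `angle`, then shift by (tx, ty).

def apply(motion, point):
    """Carry the point (x, y) to its new position under the motion."""
    angle, tx, ty = motion
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + tx, s * x + c * y + ty)

def compose(first, second):
    """Return the single motion equivalent to `first` followed by `second`."""
    a1, tx1, ty1 = first
    a2, tx2, ty2 = second
    c, s = math.cos(a2), math.sin(a2)
    # second(first(p)) = R2 (R1 p + t1) + t2 = (R2 R1) p + (R2 t1 + t2)
    return (a1 + a2, c * tx1 - s * ty1 + tx2, s * tx1 + c * ty1 + ty2)

m1 = (0.3, 1.0, -2.0)
m2 = (1.1, 0.5, 4.0)
p = (1.0, 2.0)

step_by_step = apply(m2, apply(m1, p))
single = apply(compose(m1, m2), p)
print(step_by_step)
print(single)
```

The two printed points agree up to rounding, which is the sense in which the composite motion plays the part of a “sum.”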

The new algebraic theories in this direction arose in the first half of the
19th century in the investigations of a number of mathematicians, among
whom we should particularly mention the French mathematician Galois
(1811-1832). The concepts, methods, and results of contemporary algebra
find important applications in analysis, geometry, physics, and
crystallography. In particular, the theory mentioned at the end of §3
concerning the symmetry of crystals, which was developed by E. S. Fedorov,
is based on a union of geometry with one of the new algebraic theories,
the so-called theory of groups.

As we see, we are dealing here with a fundamental, qualitative
generalization of the subject matter of algebra with a change in the very
concept of what algebra is. The ideas of contemporary algebra and the basic
elements of some of its theories will be explained in Chapters XVI and XX.

4. Analysis. Analysis in all its branches also made profound progress.
In the first place, as was already mentioned in the preceding section,
its foundations were made more precise; in particular, its basic concepts
were given exact and general definitions: such concepts as function, limit,
integral and finally, the basic concept of a variable magnitude (a rigorous
definition was given for the real number). A beginning of the process of
putting analysis on a more precise foundation was made by the Czech
mathematician Bolzano (1781-1848), the French mathematician Cauchy
(1789-1857), and a number of others. This greater precision was gained
at the same time as the new developments in algebra and geometry were
being made; it was brought to completion in its present well-known
form in the eighties of the 19th century by the German mathematicians
Weierstrass, Dedekind, and Cantor. As was mentioned at the end of §6,
Cantor also laid the foundation for the theory of transfinite sets, which
plays such a large role in the development of the newer ideas in
mathematics.

The increase in precision in the concepts of variable and function in
connection with the theory of sets laid the foundation for a further
development of analysis. A transition was made to the study of more general
functions; and in the same direction the apparatus of analysis, namely
the integral and differential calculus, was also generalized. Thus, on the
threshold of the present century, there arose the new branch of analysis
already mentioned in §6, the so-called theory of functions of a real variable.
The development of this theory is chiefly connected with the French
mathematicians Borel, Lebesgue, and others, and with N. N. Luzin
(1883-1950) and his school. In general, the newer branches of analysis are
called modern analysis in contradistinction to the earlier so-called classical
analysis.

Other new theories arose in analysis. Thus a special branch was formed
by the theory of approximation of functions, which studies questions of
the best approximate representation of general functions by various
“simple” functions, above all by polynomials, that is by functions of the
form

$$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n .$$

The theory of approximation of functions has great importance, if only
for the reason that it lays down general foundations for the practical
calculation of functions, for the approximate replacement of complicated
functions by simpler ones. The rudiments of this theory go back to the
very beginnings of analysis. Its modern direction was given to it by the
great Russian mathematician P. L. Čebyšev (1821-1894). This direction
was later developed into the so-called constructive theory of functions,
chiefly in the works of Soviet mathematicians, particularly S. N. Bernšteĭn
(born 1880), to whom belong the most important results in this field.
Chapter XII deals with approximation of functions.
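To make the idea of replacing a complicated function by a polynomial concrete, here is a small sketch. It rests on assumptions of our own: the function is e^x on the interval from -1 to 1, and the polynomial is its Taylor polynomial, which is merely a convenient choice and not the best approximation in the sense studied by the theory described above. The helper names `horner`, `exp_taylor_coeffs`, and `max_error` are illustrative.

```python
import math

def horner(coeffs, x):
    """Evaluate a_0 x^n + a_1 x^(n-1) + ... + a_n; coeffs lists a_0, ..., a_n."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def exp_taylor_coeffs(n):
    """Coefficients of the degree-n Taylor polynomial of e^x, highest power first."""
    return [1.0 / math.factorial(k) for k in range(n, -1, -1)]

def max_error(n, samples=201):
    """Largest deviation from e^x on [-1, 1], measured at equally spaced points."""
    coeffs = exp_taylor_coeffs(n)
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(horner(coeffs, x) - math.exp(x)) for x in xs)

for degree in (2, 4, 8):
    print(degree, max_error(degree))
```

The printed errors shrink as the degree grows, so only additions and multiplications are needed to compute the function to a prescribed accuracy.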

We spoke earlier about the development of the theory of functions of a
complex variable. We must still mention the so-called qualitative theory
of differential equations, originating in the works of Poincaré (1854-1912)
and A. M. Lyapunov (1857-1918), about which some ideas will be given
in Chapter V, and also the theory of integral equations. These theories
have great practical importance in mechanics, physics, and technology.
Thus, the qualitative theory of differential equations provides solutions of
problems concerning stability of motion, and the action of mechanisms
or of vibrating electric systems and the like. Stability of a process means
in the most general sense that if small changes are made in the initial data
or in the conditions of the motion, then the motion itself during the whole
of its course will change only slightly. The technical significance of
questions of this sort hardly needs to be emphasized.

5. Functional analysis. On the ground prepared by the development
of analysis and mathematical physics, along with the new ideas of geometry
and algebra, there has grown up an extensive new division of mathematics,
the so-called functional analysis, which plays an exceptionally important
role in modern mathematics. Many mathematicians shared in creating it;
let us mention, for example, the greatest German mathematician of recent
times, Hilbert (1862-1943), the Hungarian mathematician Riesz
(1880-1956), and the Polish mathematician Banach (1892-1945). The separate
Chapter XIX is devoted to functional analysis.

The essence of this new branch of mathematics consists briefly in the
following. In classical analysis the variable is a magnitude, or “number,”
but in functional analysis the function itself is regarded as the variable.
The properties of the given function are determined here not in themselves
but in relation to other functions. What is under study is not a separate
function but a whole collection of functions characterized by one property
or another; for example, the collection of all continuous functions. Such
a collection of functions forms the so-called functional space. This
procedure corresponds, for example, to the fact that we may consider the
collection of all curves on a surface or of all possible motions of a given
mechanical system, thereby defining the properties of the separate curves
or motions in their relation to other curves or motions.

The transition from the investigation of separate functions to a variable
function is similar to the transition from unknown numbers x, y to
variables x, y; that is, it is similar to the idea of Descartes mentioned in a
preceding paragraph. On the basis of this idea Descartes produced his
well-known union of algebra and geometry, of an equation and a curve,
which is one of the most important elements in the rise of analysis.
Similarly, the union of the concept of a variable function with the ideas of
contemporary algebra and geometry produced the new functional analysis.
Just as analysis was necessary for the development of the mechanics of
the time, so functional analysis provided new methods for the solution
of present-day problems of mathematical physics and produced the
mathematical apparatus for the new quantum mechanics of the atom.
History repeats itself as usual, but in a new way, on a higher plane. As we
have said, functional analysis unites the basic ideas and methods of analysis,
of modern algebra, and of geometry and in its turn exercises an influence on
the development of these branches of mathematics. The problems arising
in classical analysis now find new, more general solutions, often almost at
a single step, by means of functional analysis. Here, as at a focus, are
gathered together, in a very productive way, the most general and abstract
ideas of modern mathematics.

From this short sketch, from this mere enumeration of the new directions
of analysis (the theory of functions of a real variable, theory of
approximation of functions, qualitative theory of differential equations,
theory of integral equations, and functional analysis) it may be seen that
we are dealing here in fact with an essentially new stage in the development
of analysis.

6. Computational mathematics and mathematical logic. At all periods
the technical level of the means of computation has had an essential
influence on mathematical methods. But the equipment for carrying out
calculations which has been at our disposal until very recent times has
been very limited. The simplest devices, such as the abacus, tables of
logarithms and the logarithmic slide rule, the calculating machine, and
finally more complicated calculators and the automatic calculating
machine, these were the basic implements for computation existing up to
the forties of the 20th century. These implements made it possible to carry
out more or less quickly the separate operations of addition, multiplication,
and so forth. But to carry through to a final numerical result the practical
problems that arise nowadays requires a colossal number of such
operations, following one another in a complicated program that sometimes
depends on results obtained during the course of the calculation. The
solution of such problems proved to be practically impossible or
completely valueless on account of the length of the process of solution.
But in the last ten years a radical change has taken place in the whole
science of computation. Modern calculating machines, constructed on new
principles, allow us to make computations with exceptionally great speed
and at the same time to carry out complicated chains of calculations
automatically, according to extremely flexible programs arranged in
advance. Some of the questions connected with the construction and
significance of modern calculating machines will be discussed in Chapter XIV.

The new techniques not only enable us to carry out investigations that
were formerly quite impracticable but also lead us to change our estimate
of the value of many well-known mathematical results. For example, they
have given a special stimulus to the development of approximative
methods; that is, methods which allow us, by a chain of elementary
operations, to reach a desired numerical result with sufficiently great
accuracy. The mathematical methods themselves must now be estimated
from the point of view of their suitability for corresponding machines.





In close connection with the development of calculating techniques is
the subject of mathematical logic. It was developed primarily as a result
of intrinsic difficulties arising in mathematics itself, its subject matter
being the analysis of mathematical proof. It is itself a branch of
mathematics, and includes those branches of general logic that can be
objectively formulated and developed by the mathematical method.

Although on the one hand mathematical logic thus goes back to the very
sources and foundations of mathematics, it is closely connected, on the
other hand, with the most modern questions of computational technique.
Naturally, for example, a proof that leads to the setting up of a definite
preassigned process, permitting us to approach a desired result with an
arbitrary degree of accuracy, is essentially different from more abstract
proofs of the existence of the given result.

There also arises here a characteristic range of questions concerning
the degree of generality possible in problems that can be dealt with by a
method which is completely defined in advance at every step. Profound
results have been reached along these lines in mathematical logic, results
that are extremely important from a general epistemological point of view.

It would not be an exaggeration to say that with the development of the
new computational techniques and the achievements of mathematical
logic a new period has begun in modern mathematics, characterized by
the fact that its subject matter is not only the study of one object or
another but also all the ways and means by which such an object can be
defined; not only certain problems, but also all possible methods of
solving them.

To what has been said it is only necessary to add that also in the older
branches of mathematics, the theory of numbers, Euclidean geometry,
classical algebra and analysis, and the theory of probability, rapid
development has continued throughout the whole period of modern
mathematics so that these fields have been enriched by many new
fundamental ideas and results; let us mention, for example, the results
attained in the theory of numbers and in the geometry of everyday space by
the Russian and Soviet mathematicians P. L. Čebyšev, E. S. Fedorov, I. M.
Vinogradov, and others. The development on a wide front of the theory
of probability has been connected with the extraordinarily important
regularities observable in statistical physics and in contemporary problems
of technology.

7. Characteristic features of modern mathematics. What are the most
general characteristics of modern mathematics as a whole, distinguishing
it from the earlier development of geometry, algebra, and analysis?

First of all is the immense extension of the subject matter of mathematics
and of its applications. Such an extension of subject matter and range of
application represents an enormous quantitative and qualitative growth,
brought about by the appearance of powerful new theories and methods
that allow us to solve problems completely inaccessible up to now. This
extension of the subject matter of mathematics is characterized by the fact
that contemporary mathematics conscientiously sets itself the task of
studying all possible types of quantitative relationships and spatial
forms.

A second characteristic feature of modern mathematics is the formation
of general concepts on a new and higher level of abstraction. It is precisely
this feature that guarantees preservation of the unity of mathematics,
in spite of its immense growth in widely differing branches. Even in parts
of mathematics that are extremely far from one another similarities of
structure are brought to light by the general concepts and theories of the
present day. They guarantee that contemporary mathematical methods
will have great generality and breadth of application; in particular, they
produce a profound interpenetration of the fundamental branches of
mathematics: geometry, algebra, and analysis.

As one of the characteristic features of modern mathematics, we must
also mention the obvious dominance of the set-theoretical point of view.
Of course, this point of view owes its significance to the fact that it
summarizes in a certain sense the rich content of all the preceding
developments of mathematics. Finally, one of the most characteristic
features of modern mathematics is the profound analysis of its foundations,
of the mutual influence of its concepts, of the structure of its separate
theories, and of the methods of mathematical proof. Without such an
analysis of foundations it would not be possible to improve or develop any
further the principles and theories that have led to the present
generalizations.

The characteristic feature of modern mathematics may be said to be
that its subject matter consists not only of given quantitative relations
and forms but of all possible ones. In geometry, we speak not only of
spatial relations and forms but of all possible forms similar to spatial
ones. In algebra, we speak of various abstract systems of objects with all
possible laws of operation on them. In analysis, not only magnitudes are
considered as variables but the very functions themselves. In a functional
space all the functions of a given type (all the possible interdependences
among the variables) are brought together. Summing up, it is possible to
say that while elementary mathematics deals with constant magnitudes,
and the next period with variable magnitudes, contemporary mathematics
is the mathematics of all possible (in general, variable) quantitative relations
and interdependences among magnitudes. This definition is, of course,
incomplete, but it does emphasize the characteristic feature of modern
mathematics which distinguishes it from the mathematics of preceding
ages.*


* This section is followed in the original Russian text by two sections entitled “The
These sections are omitted in the present translation in view of the fact that they
discuss in more detail, and in the more general philosophical setting of dialectical
materialism, points of view already stated with great clarity in the preceding sections.


Suggested Reading



CHAPTER II

ANALYSIS


§1. Introduction 

The rise at the end of the Middle Ages of new conditions of manufacture
in Europe, namely the birth of capitalism, which at this time was replacing
the feudal system, was accompanied by important geographical discoveries
and explorations. In 1492, relying on the idea that the earth is spherical,
Columbus discovered the New World. The discovery by Columbus
greatly extended the boundaries of the known world and produced a
revolution in the minds of men. The end of the 15th century and the
beginning of the 16th saw the creative activity of the great artist-humanists
Leonardo da Vinci, Raphael, and Michelangelo, which gave new meaning
to art. In 1543 Copernicus published his work “On the revolution of the
heavenly bodies,” which completely changed the face of astronomy;
in 1609 appeared the “New astronomy” of Kepler, containing his first
and second laws for the motion of the planets around the sun, and in
1618 his book “Harmony of the world,” containing the third law. Galileo,
on the basis of his study of the works of Archimedes and his own bold
experiments, laid the foundations for the new mechanics, an indispensable
science for the newly arising technology. In 1609 Galileo directed his
recently constructed telescope, though still small and imperfect, toward
the night sky; the first glance in a telescope was enough to destroy the
ideal celestial spheres of Aristotle and the dogma of the perfect form of
celestial bodies. The surface of the moon was seen to be covered with
mountains and pitted with craters. Venus displayed phases like the
moon; Jupiter was surrounded by four satellites and provided a miniature
visual model of the solar system. The Milky Way fell apart into separate
stars, and for the first time men felt the staggeringly immense distance
of the stars. No other scientific discovery has ever made such an
impression on the civilized world.*

The further development of navigation, and consequently of astronomy, 
and also the new development of technology and mechanics necessitated 
the study of many new mathematical problems. The novelty of these 
problems consisted chiefly in the fact that they required mathematical 
study of the laws of motion in a broad sense of the word. 

The state of rest and motionlessness is unknown in nature. The whole
of nature, from the smallest particles up to the most massive bodies,
is in a state of eternal creation and annihilation, in a perpetual flux, in
unceasing motion and change. In the final analysis, every natural science
studies some aspect of this motion. Mathematical analysis is that branch
of mathematics that provides methods for the quantitative investigation
of various processes of change, motion, and dependence of one magnitude
on another. So it naturally arose in a period when the development of
mechanics and astronomy, brought to life by questions of technology
and navigation, had already produced a considerable accumulation of
observations, measurements, and hypotheses and was leading science
straight toward quantitative investigation of the simplest forms of motion.

The name “infinitesimal analysis” says nothing about the subject
matter under discussion but emphasizes the method. We are dealing here
with the special mathematical method of infinitesimals, or in its modern
form, of limits. We now give some typical examples of arguments which
make use of the method of limits and in one of the later sections we will
define the necessary concepts.

Example 1. As was established experimentally by Galileo, the distance
$s$ covered in the time $t$ by a body falling freely in a vacuum is expressed
by the formula

$$s = \frac{g t^2}{2} \qquad (1)$$

($g$ is a constant equal to 9.81 m/sec$^2$).† What is the velocity of the
falling body at each point in its path?

* This section is based on the beautiful essay of Academician S. I. Vavilov “Galileo”
(Great Soviet Encyclopedia, Volume 10, 1952).

† Nowadays formula (1) is deduced from the general laws of mechanics, but
historically it was just this formula which, after being established experimentally by
Galileo, served as a part of the accumulation of experience that was subsequently
generalized by those laws.

Let the body be passing through the point $A$ at the time $t$ and consider
what happens in the short interval of time of length $\Delta t$; that is, in the
time from $t$ to $t + \Delta t$. The distance covered will be increased by a
certain increment $\Delta s$. The original distance is $s_1 = g t^2 / 2$; the
increased distance is

$$s_2 = \frac{g (t + \Delta t)^2}{2} = \frac{g t^2}{2} + \frac{g}{2} \left( 2 t \, \Delta t + \Delta t^2 \right).$$

From this we find the increment

$$\Delta s = s_2 - s_1 = \frac{g}{2} \left( 2 t \, \Delta t + \Delta t^2 \right).$$

This represents the distance covered in the time from $t$ to $t + \Delta t$. To
find the average velocity over the section of the path $\Delta s$, we divide
$\Delta s$ by $\Delta t$:

$$v_{\mathrm{av}} = \frac{\Delta s}{\Delta t} = g t + \frac{g}{2} \Delta t .$$

Letting $\Delta t$ approach zero we obtain an average velocity which approaches
as close as we like to the true velocity at the point $A$. On the other hand,
we see that the second summand on the right-hand side of the equation
becomes vanishingly small with decreasing $\Delta t$, so that the average
$v_{\mathrm{av}}$ approaches the value $gt$, a fact which it is convenient to
write as follows:

$$v = \lim_{\Delta t \to 0} v_{\mathrm{av}} = \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \lim_{\Delta t \to 0} \left( g t + \frac{g}{2} \Delta t \right) = g t .$$

Consequently, $gt$ is the true velocity at the time $t$.
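The passage to the limit in Example 1 can also be watched numerically. The sketch below is only an illustration: the moment t = 2 sec and the particular intervals are our own choices, and the function names are hypothetical. It computes the average velocity over shorter and shorter intervals and compares it with gt.

```python
G = 9.81  # the constant g of the text, in m/sec^2

def distance(t):
    """Distance fallen in t seconds according to formula (1): s = g t^2 / 2."""
    return G * t * t / 2

def average_velocity(t, dt):
    """Average velocity over the interval of time from t to t + dt."""
    return (distance(t + dt) - distance(t)) / dt

t = 2.0  # illustrative moment of time, chosen arbitrarily
for dt in (1.0, 0.1, 0.01, 0.001):
    v_av = average_velocity(t, dt)
    print(f"dt = {dt:<6} v_av = {v_av:.5f}   g*t = {G * t:.5f}")
```

Each printed average exceeds gt by (g/2) times the interval, in agreement with the formula for the average velocity, so the excess dies away as the interval shrinks.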



Example 2. A reservoir with a square base of side $a$ and vertical walls
of height $h$ is full to the top with water (figure 1). With what force is the
water acting on one of the walls of the reservoir?

Fig. 1.

We divide the surface of the wall into n horizontal strips of height h/n. 
The pressure exerted at each point of the vessel is equal, by a well-known 
law, to the weight of the column of water lying above it. So at the lower 
edge of each of the strips the pressure, expressed in suitable units, will be 
equal respectively to 


$$\frac{h}{n}, \ \frac{2h}{n}, \ \frac{3h}{n}, \ \ldots, \ \frac{(n-1)h}{n}, \ h .$$




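We obtain an approximate expression for the desired force $P$, if we
assume that the pressure is constant over each strip. Thus the approximate
value of $P$ is equal to

$$P \approx \frac{ah}{n} \cdot \frac{h}{n} + \frac{ah}{n} \cdot \frac{2h}{n} + \cdots + \frac{ah}{n} \cdot \frac{(n-1)h}{n} + \frac{ah}{n} \cdot h$$

$$= \frac{a h^2}{n^2} (1 + 2 + \cdots + n) = \frac{a h^2}{n^2} \cdot \frac{n(n+1)}{2} = \frac{a h^2}{2} \left( 1 + \frac{1}{n} \right).$$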




To find the true value of the force, we divide the side into narrower and
narrower strips, increasing $n$ without limit. With increasing $n$ the magnitude
$1/n$ in the above formula will become smaller and smaller and in the limit
we obtain the exact formula

$$P = \frac{a h^2}{2} .$$
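The same limiting process can be followed with actual numbers. In this sketch the dimensions a = 2 and h = 3 are arbitrary illustrative values, and the function name is our own; the sum is formed exactly as in the text, strip by strip, with the pressure taken at the lower edge of each strip.

```python
def approximate_force(a, h, n):
    """Sum over n horizontal strips of (area of strip) * (pressure at its lower edge)."""
    strip_area = a * h / n
    return sum(strip_area * (k * h / n) for k in range(1, n + 1))

a, h = 2.0, 3.0            # illustrative dimensions, in suitable units
exact = a * h ** 2 / 2     # the limiting value a h^2 / 2

for n in (1, 10, 100, 1000):
    print(n, approximate_force(a, h, n), exact)
```

For every n the sum is the limiting value multiplied by 1 + 1/n, so the approximations decrease steadily toward the exact force.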


The idea of the method of limits is simple and amounts to the following.
In order to determine the exact value of a certain magnitude, we first
determine not the magnitude itself but some approximation to it.
However, we make not one approximation but a whole series of them, each
more accurate than the last. Then from examination of this chain of
approximations, that is from examination of the process of approximation
itself, we uniquely determine the exact value of the magnitude. By this
method, which is in essence a profoundly dialectical one, we obtain a
fixed constant as the result of a process or motion.

The mathematical method of limits was worked out as the result of the
persistent labor of many generations on problems that could not be
solved by the simple methods of arithmetic, algebra, and elementary
geometry.

What were the problems whose solution led to the fundamental concepts
of analysis, and what were the methods of solution that were set up for
these problems? Let us examine some of them.

The mathematicians of the 17th century gradually discovered that a
large number of problems arising from various kinds of motion with
consequent dependence of certain variables on others, and also from
geometric problems which had not yielded to former methods, could
be reduced to two types. Simple examples of problems of the first type
are: find the velocity at any time of a given nonuniform motion (or more
generally, find the rate of change of a given magnitude), and draw a
tangent to a given curve. These problems (our first example is one of
them) led to a branch of analysis that received the name “differential
calculus.” The simplest examples of the second type of problem are:
find the area of a curvilinear figure (the problem of quadrature), or the
distance traversed in a nonuniform motion, or more generally the total
effect of the action of a continuously changing magnitude (compare the
second of our two examples). This group of problems led to another
branch of analysis, the “integral calculus.” Thus two fundamental
problems were singled out: the problem of tangents and the problem of
quadratures.

In this chapter we will describe in detail the underlying ideas of the
solution of these two problems. Particularly important here is the theorem
of Newton and Leibnitz to the effect that the problem of quadratures is
the inverse, in a well-known sense, of the problem of tangents. For solving
the problem of tangents, and problems that can be reduced to it, there was
worked out a suitable algorithm, a completely general method leading
directly to the solution, namely the method of derivatives or of
differentiation.

The history of the creation and development of analysis and of the role
played in its growth by the analytic geometry of Descartes has already
been described in Chapter I. We see that in the second half of the 17th
century and the first half of the 18th a complete change took place in the
whole of mathematics. To the divisions that already existed, arithmetic,
elementary geometry, and the rudiments of algebra and trigonometry,
were added such general methods as analytic geometry, differential and
integral calculus, and the theory of the simplest differential equations.
It was now possible to solve problems whose solutions up to now had
been quite inaccessible.

It turned out that if the law for the formation of a given curve is not
too complicated, then it is always possible to construct a tangent to it at
an arbitrary point; it is only necessary to calculate, with the help of the
rules of differential calculus, the so-called derivative, which in most cases
requires a very short time. Up till then it had been possible to draw
tangents only to the circle and to one or two other curves, and no one had
suspected the existence of a general solution of the problem.

If we know the distance traversed by a moving point up to any desired
instant of time, then by the same method we can at once find the velocity
of the point at a given moment, and also its acceleration. Conversely,
from the acceleration it is possible to find the velocity and the distance,
by making use of the inverse of differentiation, namely integration. As a
result, it was not very difficult, for example, to prove from the Newtonian
laws of motion and the law of universal gravitation that the planets must
move around the sun in ellipses according to the laws of Kepler.

Of the greatest importance in practical life is the problem of the greatest 
and least values of a magnitude, the so-called problem of maxima and 
minima. Let us take an example: From a log of wood with circular cross
section of given radius we wish to cut a beam of rectangular cross section
such that it will offer the greatest resistance to bending. What should be
the ratio of the sides? A short argument on the stiffness of beams of
rectangular cross section (applying simple concepts from the integral
calculus), followed by the solving of a maximum problem (which involves
calculating a derivative) provides the answer that the greatest stiffness
is produced for a rectangular cross section whose height is in the ratio
to its base of $\sqrt{2} : 1$. The problems of maxima and minima are solved as
simply as those of drawing tangents. 
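
A maximum problem of this kind can be explored numerically. The sketch below
rests on an assumption that the text does not spell out: that the resistance
to bending of a beam with base b and height h is proportional to b * h^2. The
rectangle is inscribed in a circle of diameter d, so that b^2 + h^2 = d^2, and
a coarse search over b stands in for the derivative calculation.

```python
import math

def bending_resistance(b, d=1.0):
    """Assumed model: resistance proportional to b * h^2 for a rectangle
    with base b and height h inscribed in a circle of diameter d."""
    h_squared = d * d - b * b
    return b * h_squared

# Coarse search over admissible bases 0 < b < d.
d = 1.0
candidates = [d * k / 100000 for k in range(1, 100000)]
best_b = max(candidates, key=bending_resistance)
best_h = math.sqrt(d * d - best_b * best_b)
ratio = best_h / best_b
print(ratio)  # close to sqrt(2)
```

Under this model the derivative of b (d^2 - b^2) vanishes at b = d / sqrt(3),
which gives the ratio of height to base sqrt(2) : 1.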

At various points of a curved line, if it is not a straight line or a circle, 
the curvature is in general different. How can we calculate the radius 
of a circle with the same curvature as the given line at the given point, 
the so-called radius of curvature of the curve at the point? It turns out 
that this is equally simple; it is only necessary to apply the operation of 
differentiation twice. The radius of curvature plays a great role in many
questions of mechanics. 

Before the invention of the new methods of calculation, it had been 
possible to find the area only of polygons, of the circle, of a sector or a 
segment of the circle, and of two or three other figures. In addition, 
Archimedes had already invented a way to calculate the area of a segment 
of a parabola. The extremely ingenious method which he used in this 
problem was based on special properties of the parabola and consequently
gave rise to the idea that every new problem in the calculation of area 
would very likely require its own methods of investigation, even more 
ingenious and difficult than those of Archimedes. So mathematicians 
were greatly pleased when it turned out that the theorem of Newton and 
Leibnitz, to the effect that the inversion of the problem of tangents would 
solve the problem of quadrature, at once provided a method of calculating
the areas bounded by curves of widely different kinds. It became clear
that a general method exists, which is suitable for an infinite number of the
most different figures. The same remark is true for the calculation of 
volumes, surfaces, the lengths of curves, the mass of inhomogeneous 
bodies, and so forth. 

The new method accomplished even more in mechanics. It seemed 
that there was no problem in mechanics that the new calculations would 
not clarify and solve. 

Not long before, Pascal had explained the increase in the size of the 
Torricelli vacuum with increasing altitude as a consequence of the decrease
in atmospheric pressure. But exactly what is the law governing this
decrease? The question is answered immediately by the investigation
of a simple differential equation.

It is well known to sailors that they should take two or three turns of
the mooring cable around the capstan if one man is to be able to keep 
a large vessel at its mooring. Why is this? It turned out that from a 
mathematical point of view the problem is almost completely identical 
with the preceding one and can be solved at once. 
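
The kinship of the two problems can be made concrete. In the usual simplified
models, which we assume here and which the text does not write out, the
pressure satisfies dp/dh = -k p and the tension in the cable satisfies
dT/dθ = μ T, so both are governed by an equation of the form dy/dx = k y with
exponential solution. The constants k and μ below are hypothetical values
chosen only for illustration.

```python
import math

def euler_solve(k, y0, x_end, steps=100000):
    """Integrate dy/dx = k * y from x = 0 to x_end by Euler's method."""
    dx = x_end / steps
    y = y0
    for _ in range(steps):
        y += k * y * dx
    return y

# Pressure versus altitude: dp/dh = -k * p (k is a hypothetical constant).
k = 0.125   # per kilometre, illustrative only
p0 = 1.0    # pressure at sea level, in atmospheres
p_numeric = euler_solve(-k, p0, 5.0)
p_exact = p0 * math.exp(-k * 5.0)

# Cable on a capstan: dT/dtheta = mu * T (mu is a hypothetical friction coefficient).
mu = 0.3
three_turns = 3 * 2 * math.pi
gain = math.exp(mu * three_turns)  # ratio of the ship's pull to the sailor's pull
print(p_numeric, p_exact, gain)
```

With these assumed constants, three turns of the cable multiply the holding
force by a factor of several hundred, which is why one man suffices.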

Thus, after the creation of analysis, there followed a period of tempestuous
development of its applications to the most varied branches of technology
and natural science. Since it is founded on abstraction from the
special features of particular problems, mathematical analysis reflects
the actual deep-lying properties of the material world; and this is the
reason why it provides the means for investigation of such a wide range
of practical questions. The mechanical motion of solid bodies, the motion
of liquids and gases and of their individual particles, their laws of flow in the
mass, the conduction of heat and electricity, the course of chemical
reactions, all these phenomena are studied in the corresponding sciences
by means of mathematical analysis. 

At the same time as its applications were being extended, the subject 
of analysis itself was being immeasurably enriched by the creation and
development of various new branches, such as the theory of series, applications
of geometry to analysis, and the theory of differential equations.

Among mathematicians of the 18th century, there was a widespread
opinion that any problem of the natural sciences, provided only that one
could find a correct mathematical description of it, could be solved by
means of analytic geometry and the differential and integral calculus.

Mathematicians proceeded gradually to more complicated problems 
of natural science and technology, which demanded further development 
of their methods. For the solution of such problems it became necessary 
to create further branches of mathematics: the calculus of variations, 
the theory of functions of a complex variable, field theory, integral
equations, and functional analysis. But all these new methods of calculation
were essentially immediate extensions and generalizations of
the remarkable methods discovered in the 17th century. The greatest
mathematicians of the 18th century, Daniel Bernoulli (1700-1782),
Leonhard Euler (1707-1783) and Lagrange (1736-1813), who blazed new
paths in science, constantly took as their starting point the fundamental
problems of the exact sciences. This energetic development of analysis
was continued into the 19th century by such famous mathematicians as
Gauss (1777-1855), Cauchy (1789-1857), M. V. Ostrogradskiĭ (1801-1861),
P. L. Čebyšev (1821-1894), Riemann (1826-1866), Abel (1802-1829),
Weierstrass (1815-1897), all of whom made truly remarkable contributions
to the development of mathematical analysis.

The Russian mathematical genius, N. I. Lobačevskiĭ, had an influence
on the development of certain questions of mathematical analysis, and we
should also mention the leading mathematicians who were active at the 
turn of the 20th century: A. A. Markov (1856-1922), A. M. Lyapunov 
(1857-1918), H. Poincaré (1854-1912), F. Klein (1849-1925), D. Hilbert 
(1862-1943). 

The second half of the 19th century witnessed a profound critical 
examination and clarification of the foundations of analysis. The various 
powerful methods that had accumulated were now put on a uniform 
systematic basis, corresponding to the advanced level of mathematical 
rigor. All these methods are the means by which, along with arithmetic,
algebra, geometry and trigonometry, we give a mathematical interpretation
to the world around us, describe the course of actual events, and
solve the important practical problems connected with them. 

Analytic geometry, differential and integral calculus, and the theory
of differential equations are studied at all technical institutes, so that these
branches of mathematics are known to millions of citizens; the elements
of these sciences are also taught at many technical schools; there is also 
some question of their being introduced into the secondary schools. 

In most recent times the general use of rapid calculating machines has 
introduced a new era in mathematics. These machines, in conjunction 
with the branches of mathematics just mentioned, open up strange new 
possibilities for mankind. 

At the present time, analysis and the branches arising from it represent
a widely diversified mathematical science, consisting of several broad 
independent disciplines closely connected with one another; each of these 
disciplines is being developed and perfected. 

More than ever before, a significant role is being played in analysis by
the requirements of daily life, by problems connected with the imposing
development of technology. Of great importance are the aerodynamical
problems of hypersonic velocities, which are being solved with constant
success. The most difficult problems of mathematical physics have now
reached the stage where they can be solved in practical numerical form.
In contemporary physics such theories as quantum mechanics (which
studies the problems peculiar to the microcosm of the atom) not only 
require the most advanced branches of contemporary mathematical 
analysis for solving their problems but could not even describe their 
fundamental concepts without the use of analysis. 

The purpose of the present chapter is to give a popular presentation,
suitable for a reader acquainted only with elementary mathematics,
of the growth and the simplest applications of such basic concepts of
analysis as function, limit, derivative, and integral. Since the various
special branches of analysis will be dealt with in other chapters of the
book, the present chapter has a more elementary character and a reader
who has already studied a usual first course in analysis may omit it 
without harm to his understanding of the rest of the book. 

§2. Function 

The concept of a function. The various objects or phenomena that 
we observe in nature are organically connected with one another; they 
are interdependent. The simplest relations of this sort have long been
known to mankind and information about them has been accumulated
and formulated as physical laws. These laws indicate that the various
magnitudes characterizing a given phenomenon are so closely related
to one another that some of them are completely determined by the values
of others. For example, the lengths of the sides of a rectangle completely
determine its area, the volume of a given amount of gas at a given temperature
is determined by the pressure, and the elongation of a given
metallic rod is determined by its temperature. It was uniformities of this
sort that served as the origin of the concept of function. 

Consider an algebraic formula which, corresponding to each value 
of the literal magnitudes occurring in it, allows us to find the value of the 
magnitude expressed by the formula; the basic idea here is that of a 
function. Let us consider some examples of functions expressed by such 
formulas. 

1. Let us suppose that at the beginning of a certain period of time a
material point was at rest and that subsequently it began to fall as the
result of gravity. Then the distance s traced out by the point up to time
t is expressed by the formula

$$s = \frac{g t^2}{2}, \tag{1}$$

where g is the acceleration of gravity.


Fig. 2. An open box of height x constructed from a square of side a.
2. From a square of side a we construct an open rectangular box of
height x (figure 2). The volume V of the box is calculated from the formula

$$V = x(a - 2x)^2. \tag{2}$$

Formula (2) allows us, for every height x under the obvious restriction
0 < x < a/2, to find the volume of the box.

3. Let a pillar (figure 3) be erected at the center of a circular skating rink
with a light at height h. The illumination T at the edge of the circle may be
expressed by the formula

$$T = \frac{A \sin\alpha}{h^2 + r^2}, \tag{3}$$

where r is the radius of the circle, tan α = h/r, and A is a certain magnitude
characterizing the power of the light. If we know the height h we can
calculate T from formula (3).

Fig. 3. A light at height h on a pillar at the center of a circular rink of radius r.

4. The roots of the quadratic equation

$$x^2 + px - 1 = 0 \tag{4}$$

are given by the formula

$$x = -\frac{p}{2} \pm \sqrt{1 + \frac{p^2}{4}}. \tag{5}$$

The characteristic feature of a formula in general, and of the examples
just given in particular, is that the formula enables us, for any given value
of one of the variables (the time t, the height x of the box, the height h
of the pillar, the coefficient p of the quadratic equation), which is called
the independent variable, to calculate the value of the other variable
(the distance s, the volume V, the illumination T, the root x of the equation),
which is called a dependent variable or a function of the first
variable.

Each of the formulas introduced provides an example of a function:
the distance s traced by the point is a function of the time t; the volume
V of the box is a function of the height x; the illumination T of the edge of the
rink is a function of the height h of the pillar; the two roots of the quadratic
equation (4) are functions of the coefficient p.

It should be remarked that in some cases the independent variable may 
assume any desired numerical value, as in example 4 where the coefficient 
p of the quadratic equation (4) may be an arbitrary number. In other cases
the independent variable may take an arbitrary value from some set 
(or collection) of numbers determined in advance; as in example 2, where 
the volume of the box is a function of its height x, which can take any 
value from the set of numbers x satisfying the inequality 0 < x < a/2. 
Similarly, in example 3 the illumination T at the edge of the rink is a 
function of the height h of the pillar, which theoretically can take any 
value satisfying the inequality h > 0, but in practice h must satisfy the 
inequalities 0 < h < H, where the magnitude H is determined by the 
technical facilities at the disposal of the administration of the rink.

Let us introduce other examples of this kind. The formula

$$y = \sqrt{1 - x^2}$$

determines a real function (expressing a relationship between the real
numbers x and y) only for those values of x which satisfy the inequalities
$-1 \le x \le 1$, and the formula $y = \log(1 - x^2)$ only for those x
which satisfy the inequalities $-1 < x < 1$.

So it is necessary to take account of the fact that actual functions may 
not be defined for all numerical values of the independent variable but
only for those values which belong to a certain set, which most often 
fills out an interval on the x-axis, with or without the end points. 

We are now in a position to give the definition of a function accepted
in present-day mathematics.

The (dependent) magnitude y is a function of the (independent) magnitude
x if there exists a rule whereby to each value of x belonging to a certain
set of numbers there corresponds a definite value of y.

The set of values x appearing in this definition is called the domain
of the function. 
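
The definition pairs a rule with a domain, and both can be written down
explicitly. The following is a minimal sketch, not a standard construction:
the function names are our own, the side a = 1 is an assumed default, and
values outside the domain are rejected with an error.

```python
import math

def box_volume(x, a=1.0):
    """Volume V = x * (a - 2x)^2 of the open box of example 2.
    Domain: 0 < x < a/2."""
    if not 0 < x < a / 2:
        raise ValueError("x lies outside the domain 0 < x < a/2")
    return x * (a - 2 * x) ** 2

def upper_semicircle(x):
    """y = sqrt(1 - x^2). Domain: -1 <= x <= 1."""
    if not -1 <= x <= 1:
        raise ValueError("x lies outside the domain -1 <= x <= 1")
    return math.sqrt(1 - x * x)

print(box_volume(0.25))       # 0.25 * 0.5**2
print(upper_semicircle(0.6))  # close to 0.8
```

The same rule applied outside its domain has no meaning here: box_volume(0.5)
with a = 1 raises an error instead of returning a value.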

Every new concept gives rise to a new symbolism. The transition from 
arithmetic to algebra was made possible by the construction of formulas 
which were valid for arbitrary numbers, and the search for general 
solutions gave rise to the literal symbolism of algebra. 

The problem of analysis is the study of functions, that is of the dependence
of one variable on another. Consequently, just as in algebra a transition
took place from concrete numbers to arbitrary numbers, denoted
by letters, so in analysis there was the corresponding transition from
concrete formulas to arbitrary formulas. The phrase “y is a function of x”
is conventionally written as

$$y = f(x).$$

Just as in algebra different letters are used for different numbers, so in
analysis different notations are used for different types of dependence,
that is for different functions: thus we write $y = F(x)$, $y = \phi(x)$, and so on.

Graphs of functions. One of the most fruitful and brilliant ideas of the 
second half of the 17th century was the idea of the connection between 
the concept of a function and the geometric representation of a line.
This connection can be realized, for example, by means of a rectangular
Cartesian system of coordinates, with which the reader is certainly familiar
in a general way from his secondary school mathematics. 

Let us set up on the plane a rectangular Cartesian system of coordinates.
This means that on the plane we choose two mutually perpendicular lines
(the axis of abscissas and the axis of ordinates), on each of which we fix
a positive direction. Then to each point M of the plane we may assign
two numbers (x, y), which are its coordinates, expressing in the given
system of measurement the distance, taken with the proper sign,* of the
point M from the axis of ordinates and the axis of abscissas respectively. 

With such a system of coordinates we may represent functions graphically
in the form of certain lines. Suppose we are given a function

$$y = f(x). \tag{6}$$

This means, as we know, that for every value of x belonging to the domain
of definition of the given function, it is possible to determine by some
means, for example by calculation, a corresponding value y. Let us give
to x all possible numerical values, for each x determine y according to our
rule (6), and construct on the plane the point with coordinates x and y.
In this way, for every point M' on the x-axis (figure 4) there will correspond
a point M with coordinates x and y = f(x). The set of all points M forms
a certain line, which we call the graph of the function y = f(x).

* The number x is the abscissa and y is the ordinate of the point M.

Thus, the graph of the function f(x) is the geometric locus of the points
whose coordinates satisfy equation (6).

In school we became acquainted with the graphs of the simplest functions.
Thus the reader probably knows that the graph of the function y = kx + b,
where k and b are constants, is a straight line (figure 5)
forming the angle α with the positive direction of the x-axis, where
tan α = k, and intersecting the y-axis at the point (0, b). This function is
called a linear function. 

Linear functions occur very frequently in the applications. Let us recall 
that many physical laws are represented, with considerable accuracy,
by linear functions. For example, the length l of a body may be considered
with good approximation as a linear function of its temperature

$$l = l_0 + \alpha l_0 t,$$

where α is the coefficient of linear expansion, and $l_0$ is the length of the
body for t = 0. If x is the time and y is the distance covered by a moving
point, then the linear function y = kx + b obviously expresses the fact
that the point is moving with uniform velocity k; and the number b
denotes the distance, at time $x_0 = 0$, of the moving point from the fixed
zero-point from which we measure our distances. Linear functions are
extremely useful because of their simplicity and because it is possible to 
consider nonuniform changes as being approximately linear, even if only 
for small intervals. 

But in many cases it is necessary to make use of nonlinear functional 
dependence. Let us recall for example the law of Boyle-Mariotte

$$v = \frac{c}{p},$$

where the magnitudes p and v are inversely proportional. The graph of
such a relation is a hyperbola (figure 6).

The physical law of Boyle-Mariotte corresponds actually to the case 
that p and v are positive; it represents a branch of the hyperbola lying 
in the first quadrant. 

The general class of oscillatory processes includes periodic motions, 
which are usually described by the familiar trigonometric functions.
For example, if we extend a hanging spring from its position of equilibrium,
then, so long as we stay within the elastic limits of the spring, the point
A will perform vertical oscillations which are quite accurately expressed
by the law

$$x = a \cos(pt + \alpha),$$

where x is the displacement of the point A from its position of equilibrium,
t is the time, and the numbers a, p and α are certain constants determined
by the material, the dimensions, and the initial extension of the spring.

It should be kept in mind that a function may be defined in various
domains by various formulas, determined by the circumstances of the
case. For example, the relation Q = f(t) between the temperature t of a
gram of water (or ice) and the quantity of heat Q in it, as t varies between
−10° and +10°, is a completely determined function which it is difficult
to express in a single formula,* but it is easy to represent this function
by two formulas. Since the specific heat of ice is equal to 0.5 and that of
water is equal to 1, this function, if we agree that Q = 0 at −10°, is
represented by the formula

$$Q = 0.5t + 5,$$

as t varies in the interval −10° ≤ t < 0° and by another formula

$$Q = t + 85,$$

as t varies in the interval 0° < t ≤ 10°. For t = 0 this function is indefinite
or multiple-valued; for convenience, we may agree that at t = 0
it takes some well-defined value, for example f(0) = 45. The graph of the
function Q = f(t) is given in figure 7.

Fig. 6. Fig. 7.


* This does not mean that such an expression is impossible. In Chapter XII we will
show how to obtain a single formula.
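
A function given by different formulas on different parts of its domain is
easily written as a single rule with branches. The sketch below follows the
two formulas and the convention f(0) = 45 stated in the text; the function
name is our own.

```python
def heat_content(t):
    """Quantity of heat Q in one gram of water or ice at temperature t
    (degrees), with Q = 0 at t = -10, following the two formulas in the text."""
    if not -10 <= t <= 10:
        raise ValueError("t must lie between -10 and +10 degrees")
    if t < 0:
        return 0.5 * t + 5  # ice, specific heat 0.5
    if t > 0:
        return t + 85       # water, specific heat 1
    return 45               # value agreed upon by convention at t = 0

print(heat_content(-10), heat_content(-2), heat_content(0), heat_content(10))
```

The jump from 5 to 85 across t = 0 corresponds to the heat absorbed in
melting, which is why no single value at t = 0 is forced on us.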

We have introduced many examples of functions given by formulas.
The possibility of representing a function by means of formulas is extremely 
important from the mathematical point of view, since such formulas 
provide very favorable conditions for investigating the properties of the 
functions by mathematical methods. 

But one must not think that a formula is the only method of defining a 
function. There are many other methods; for example, the graph of the 
function, which gives a visual geometric picture of it. The following
example gives a good illustration of another method. 

To record variation of the temperature of the air during the course
of 24 hours, meteorological stations make use of an instrument called
the thermograph. A thermograph consists of a drum rotated about its
axis by a clockwork mechanism, and of a curved brass framework that
is extremely sensitive to changes of temperature. As a result, a pen fastened
to the framework by a system of levers rises with rising temperature;
and conversely, a fall in the temperature lowers the pen. On the drum is
wound a ribbon of graph paper, on which the pen draws a continuous
line, forming the graph of the function T = f(t), which expresses the
interdependence of the time and the temperature of the air. From this
graph we may determine, without calculation, the value of the temperature
at any moment of time t.

This example shows that a graph in itself determines a function
independently of whether the function is given by a formula or not.

Incidentally, we shall return to this question (see Chapter XII) 
and shall prove the following important assertion: Every continuous 
graph can be represented by a formula, or, as it is still customary to say, 
by an analytic expression. This statement is also true for many discontinuous
graphs.*

We remark that the truth of this statement, which is of great theoretical 
importance, was completely realized in mathematics only in the middle 
of the past century. Up to that time mathematicians understood by the 
term “function” only an analytic expression (formula). But they were 
under the mistaken impression that many discontinuous graphs did not 
correspond to any analytic expression, since they assumed that if a function 
was given by a formula, then its graph must possess certain particularly 
desirable properties in comparison with the other graphs.

* Of course, the above statement will be completely clear to the reader only after
we have given a precise definition of exactly what is meant in mathematics by the term
“formula” and “analytic expression.”

But in the 19th century, it was discovered that every continuous graph
may be represented by a more or less complicated formula. Thus the
exceptional role of the analytic expression as a means of definition of
functions was weakened and there came into existence the new, more
flexible definition given above for the concept of a function. By this
definition a variable y is called a function of a variable x if there exists
a rule whereby to every value of x in the domain of definition of the
function there corresponds a completely determined value y, independent
of the way in which this rule is given: by a formula, a graph, a table or
in any other way.

We may remark here that in the mathematical literature the above
definition of a function is often associated with the name of Dirichlet,
but it is worth emphasizing that this definition was given simultaneously
and independently by N. I. Lobačevskiĭ. Finally we suggest as an exercise
that the reader sketch the graphs of the functions $x^3$, $\sqrt{x}$, $\sin x$, $\sin 2x$,
$\sin(x + \pi/4)$, $\ln x$, $\ln(1 + x)$, $|x - 3|$, $(x + |x|)/2$.

We should also note that the graph of a function which for all values
of x satisfies the relation

$$f(-x) = f(x)$$

is symmetric with respect to the y-axis and in the case

$$f(-x) = -f(x)$$

the graph is symmetric with respect to the origin of coordinates. Consider
also how to obtain the graph of a function f(a + x), when a is a constant,
from the graph of f(x). Finally, consider how, using the graphs of the
functions f(x) and φ(x), it is possible to find the values of the composite
function y = f[φ(x)].
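
These remarks can be tried out on concrete functions. The sketch below is
only an experiment on sample points, not a proof: the helper names are our
own, and a check on finitely many points can suggest a symmetry but cannot
establish it for all x.

```python
import math

def is_even(f, xs, tol=1e-12):
    """Check f(-x) = f(x) on the sample points xs."""
    return all(abs(f(-x) - f(x)) <= tol for x in xs)

def is_odd(f, xs, tol=1e-12):
    """Check f(-x) = -f(x) on the sample points xs."""
    return all(abs(f(-x) + f(x)) <= tol for x in xs)

def shift(f, a):
    """Return g with g(x) = f(a + x); its graph is the graph of f
    moved by a units along the x-axis (to the left when a > 0)."""
    return lambda x: f(a + x)

def compose(f, phi):
    """Return the composite function x -> f(phi(x))."""
    return lambda x: f(phi(x))

samples = [0.1 * k for k in range(1, 30)]
print(is_even(math.cos, samples), is_odd(math.sin, samples))
g = shift(lambda x: x * x, 3)  # g(x) = (3 + x)^2
h = compose(math.sqrt, abs)    # h(x) = sqrt(|x|)
print(g(-3), h(-4))
```

To evaluate the composite function at x one first reads off φ(x) and then
applies f to that value, which is exactly what compose does.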

§3. Limits 

In §1 it was stated that modern mathematical analysis uses a special
method, which was worked out in the course of many centuries and serves
now as its basic instrument. We are speaking here of the method of
infinitesimals, or, what is essentially the same, of limits. We shall try to give
some idea of these concepts. For this purpose we consider the following 
example. 

We wish to calculate the area bounded by the parabola with equation
$y = x^2$, by the x-axis and by the straight line x = 1 (figure 8). Elementary
mathematics will not furnish us with a means for solving this problem. 
But here is how we may proceed. 

We divide the interval [0, 1] along the x-axis into n equal parts at the
points

$$0, \ \frac{1}{n}, \ \frac{2}{n}, \ \dots, \ \frac{n-1}{n},$$

and on each of these parts construct the rectangle whose left side extends
up to the parabola. As a result we obtain the system of rectangles shaded
in figure 8, the sum $S_n$ of whose areas is given by*

$$S_n = \frac{1^2 + 2^2 + \dots + (n-1)^2}{n^3} = \frac{(n-1)\,n\,(2n-1)}{6n^3}.$$

Let us express $S_n$ in the following form:

$$S_n = \frac{1}{3} + \left(\frac{1}{6n^2} - \frac{1}{2n}\right) = \frac{1}{3} + \alpha_n. \tag{7}$$

The quantity $\alpha_n$, which depends on n, is admittedly rather unwieldy in
appearance, but it possesses a certain remarkable property: If n is increased
beyond all bounds, then $\alpha_n$ approaches 0. This property may also be
expressed as follows: If we are given an arbitrary positive number ε, then it
is possible to choose an integer N sufficiently large that for all n greater
than N the number $\alpha_n$ will be less than the given ε in absolute value.†

Fig. 8.


* If in the obvious equalities $(k + 1)^3 - k^3 = 3k^2 + 3k + 1$, for the different values
k = 1, 2, ..., n − 1, we add the left and right sides separately, we obtain the equation

$$n^3 - 1 = 3\sigma_n + \frac{3(n-1)n}{2} + n - 1,$$

where $\sigma_n = 1^2 + 2^2 + \dots + (n-1)^2$. Solving this equation for $\sigma_n$, we get

$$\sigma_n = \frac{(n-1)\,n\,(2n-1)}{6}.$$

† For example, if ε = 0.001, we may take N = 500. In fact, since

$$\frac{1}{6n^2} < \frac{1}{2n}$$

for positive integral n, therefore

$$|\alpha_n| = \left|\frac{1}{6n^2} - \frac{1}{2n}\right| = \frac{1}{2n} - \frac{1}{6n^2} < \frac{1}{2n} < 0.001$$

for arbitrary n > 500. In the same way it would be possible to assign arbitrarily small
values ε, for example

$$\varepsilon_1 = 0.0001, \quad \varepsilon_2 = 0.00001, \quad \dots,$$

and for each of them to choose, as above, appropriate values $N = N_1, N_2, \dots$.

The magnitude $\alpha_n$ is an example of an infinitesimal in the sense in which
that word is used in modern mathematics.

In figure 8 we see that if we increase the number n beyond all bounds,
the sum $S_n$ of the areas of the shaded rectangles will approach the desired
area of the curvilinear figure. On the other hand, equation (7), in view
of the fact that $\alpha_n$ approaches zero as n increases beyond all bounds,
shows that the sum $S_n$ at the same time approaches 1/3. From this it
follows that the desired area S of the figure is equal to 1/3, and we have
solved our problem.

So the method under discussion amounts to this, that in order to find
a certain magnitude S we introduce another magnitude $S_n$, a variable
magnitude which approaches S through particular values $S_1$, $S_2$,
$S_3$, ..., which depend according to some law on the natural numbers
n = 1, 2, .... Then, from the fact that the variable $S_n$ may be represented
as the sum of a constant 1/3 and an infinitesimal $\alpha_n$, we conclude that $S_n$
approaches 1/3 and so S = 1/3. In the language of the modern theory of
limits we may say that for increasing n the variable magnitude $S_n$
approaches a limit, which is equal to 1/3.

Now let us give a precise definition of the concepts introduced here.

If a variable magnitude $\alpha_n$ (n = 1, 2, ...) has the property that for every
arbitrarily small positive number ε it is possible to choose an integer N
so large that for all n > N we have $|\alpha_n| < \varepsilon$, then we say that $\alpha_n$ is an
infinitesimal and we write

$$\lim_{n \to \infty} \alpha_n = 0 \quad \text{or} \quad \alpha_n \to 0.$$

On the other hand, if a variable $x_n$ may be represented as a sum

$$x_n = a + \alpha_n,$$

where a is constant and $\alpha_n$ is an infinitesimal, then we say that the variable
$x_n$, for n increasing beyond all bounds, approaches the number a and we
write

$$\lim x_n = a \quad \text{or} \quad x_n \to a.$$

The number a is called the limit of $x_n$. In particular the limit of an
infinitesimal is obviously zero.
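
The rectangle sums for the parabola and the choice of N for a given ε can be
checked by direct computation. The sketch below uses exact rational
arithmetic; the function names are our own, and the closed form and the
expression for α_n are those derived in the text, namely
S_n = (n − 1) n (2n − 1) / (6 n^3) and α_n = 1/(6 n^2) − 1/(2n).

```python
from fractions import Fraction

def rectangle_sum(n):
    """Sum of the areas of the n left-endpoint rectangles under y = x^2 on [0, 1]."""
    width = Fraction(1, n)
    return sum(width * (Fraction(k, n) ** 2) for k in range(n))

def closed_form(n):
    """S_n = (n - 1) n (2n - 1) / (6 n^3)."""
    return Fraction((n - 1) * n * (2 * n - 1), 6 * n ** 3)

def alpha(n):
    """alpha_n = S_n - 1/3 = 1/(6 n^2) - 1/(2 n)."""
    return Fraction(1, 6 * n * n) - Fraction(1, 2 * n)

for n in (10, 100, 1000):
    s = rectangle_sum(n)
    # The direct sum, the closed form, and 1/3 + alpha_n agree exactly.
    assert s == closed_form(n) == Fraction(1, 3) + alpha(n)
    print(n, float(s), float(alpha(n)))

# For eps = 0.001 the choice N = 500 works: |alpha_n| < eps for the tested n > N.
eps = Fraction(1, 1000)
print(all(abs(alpha(n)) < eps for n in range(501, 5001)))
```

The printed sums creep up toward 1/3 while α_n shrinks toward zero through
negative values, as the argument with ε and N predicts.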

Let us consider the following examples of variable magnitudes

$$x_n = \frac{1}{n}, \quad y_n = -\frac{1}{n^2}, \quad z_n = \frac{(-1)^n}{n},$$

$$u_n = \frac{n-1}{n} = 1 - \frac{1}{n}, \quad v_n = (-1)^n \quad (n = 1, 2, \dots).$$

It is clear that $x_n$, $y_n$, and $z_n$ are infinitesimals, the first of them approaching
zero through decreasing values, the second through increasing negative
values, while the third takes on values which oscillate around zero.
Further, $u_n \to 1$, while $v_n$ does not have a limit at all, since with increasing
n it does not approach any constant number but continually oscillates,
taking on the values 1 and −1.
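
The behavior of these five magnitudes is easy to tabulate. The sketch below
takes x_n = 1/n, y_n = −1/n^2, z_n = (−1)^n / n, u_n = (n − 1)/n and
v_n = (−1)^n, which is our reading of the formulas above; the function names
simply mirror the letters used in the text.

```python
def x(n): return 1 / n             # decreases to zero
def y(n): return -1 / n ** 2       # increases to zero through negative values
def z(n): return (-1) ** n / n     # oscillates around zero
def u(n): return (n - 1) / n       # approaches 1
def v(n): return (-1) ** n         # alternates between -1 and 1, no limit

for n in (1, 2, 3, 10, 100, 1000):
    print(n, x(n), y(n), z(n), u(n), v(n))
```

A table of this kind suggests the limits but proves nothing; the proof is the
ε and N argument given in the definition.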

Another important concept in analysis is that of an infinitely large
magnitude, which is defined as a variable $x_n$ (n = 1, 2, ...), with the
property that after choice of an arbitrarily large positive number M it is
possible to find a number N such that for all n > N

$$|x_n| > M.$$

The fact that the magnitude $x_n$ is infinitely large is written thus

$$\lim x_n = \infty \quad \text{or} \quad x_n \to \infty.$$

Such a magnitude $x_n$ is said to approach infinity. If it is positive (negative)
from some value on, this fact is expressed thus: $x_n \to +\infty$ ($x_n \to -\infty$).
For example, for n = 1, 2, ...

$$\lim n^2 = +\infty, \quad \lim(-n^3) = -\infty;$$

$$\lim \log\frac{1}{n} = -\infty, \quad \lim \tan\left(\frac{\pi}{2} + \frac{1}{n}\right) = -\infty.$$

It is easy to see that if a magnitude $\alpha_n$ is infinitely large, then $\beta_n = 1/\alpha_n$
is infinitely small, and conversely.

Two variable magnitudes $x_n$ and $y_n$ may be added, subtracted, multiplied,
and divided the one by the other so as to produce new magnitudes that
are in general also variable: namely their sum $x_n + y_n$, their difference
$x_n - y_n$, their product $x_n y_n$, and their quotient $x_n / y_n$. Correspondingly
their particular values will be

$$x_1 \pm y_1, \quad x_2 \pm y_2, \quad x_3 \pm y_3, \quad \dots$$

$$x_1 y_1, \quad x_2 y_2, \quad x_3 y_3, \quad \dots$$

$$\frac{x_1}{y_1}, \quad \frac{x_2}{y_2}, \quad \frac{x_3}{y_3}, \quad \dots$$




It is also possible to prove, as is fairly evident, that if the variables $x_n$ and
$y_n$ approach finite limits, then their sum, difference, product, and quotient
also approach limits which are correspondingly equal to the sum, difference,
product, and quotient of these limits. This fact may be expressed thus:

$$\lim(x_n \pm y_n) = \lim x_n \pm \lim y_n; \quad \lim(x_n y_n) = \lim x_n \cdot \lim y_n;$$

$$\lim \frac{x_n}{y_n} = \frac{\lim x_n}{\lim y_n}.$$

However, in the case of the quotient it is necessary to assume that the
limit of the denominator ($\lim y_n$) is not equal to zero. If $\lim y_n = 0$
and $\lim x_n \ne 0$, then the ratio of $x_n$ to $y_n$ will not have a finite limit but
will approach infinity.

Especially interesting, and at the same time important, is the case when
the numerator and the denominator simultaneously approach zero. Here
it is impossible to state in advance whether the ratio $x_n / y_n$ will approach
a limit, and if it does, what that limit will be, since the answer to this
question depends entirely on the character of the approach of $x_n$ and $y_n$
to zero. For example, if

$$x_n = \frac{1}{n}, \quad y_n = \frac{1}{n^2}, \quad z_n = \frac{(-1)^n}{n} \quad (n = 1, 2, \ldots),$$

then

$$\frac{y_n}{x_n} = \frac{1}{n} \to 0, \qquad \frac{x_n}{y_n} = n \to \infty.$$

On the other hand, the magnitude

$$\frac{x_n}{z_n} = (-1)^n$$

evidently does not approach any limit.

Thus the case when the numerator and the denominator of the fraction
both approach zero cannot be dealt with in advance by general theorems,
and for each particular fraction of this kind it is necessary to make a
special investigation.
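A short numerical sketch makes the three possible behaviors visible. The sample sequences $1/n$, $1/n^2$, and $(-1)^n/n$ are taken here as assumptions for the illustration; all three approach zero, yet their ratios behave quite differently.

```python
# Three infinitesimals (sample sequences assumed for this illustration).
def x(n):
    return 1 / n


def y(n):
    return 1 / n**2


def z(n):
    return (-1) ** n / n


for n in (10, 100, 1000, 1001):
    # y/x tends to 0, x/y grows without bound, x/z keeps jumping between +1 and -1
    print(n, y(n) / x(n), x(n) / y(n), x(n) / z(n))
```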

We shall see later that the fundamental problem of the differential
calculus, which may be considered as the problem of determining the
velocity of a nonuniform motion at a given moment, reduces to determining
the limit of the ratio of two infinitesimal magnitudes, namely the increase
of the distance covered and the increase in the time.

So far we have considered variables $x_n$ which take on a sequence of
numerical values $x_1, x_2, x_3, \ldots, x_n, \ldots$, while the index $n$ runs through
the sequence of natural numbers $n = 1, 2, 3, \ldots$. But it is also possible
to consider the case that $n$ varies continuously, like the time for example,
and here also to determine the limit of the variable $x_n$. The properties
of such limits are completely analogous to those formulated earlier for
discrete (that is, discontinuous) variables. We also note that there is no
special significance in the fact that $n$ increases beyond all bounds. It is
equally possible to consider the case that, while varying continuously,
$n$ approaches a given value $n_0$.

As an example let us investigate the variation in the magnitude of
$(\sin x)/x$ as $x$ approaches zero. Table 1 shows the values of this magnitude
for certain values of $x$:

Table 1

    x        (sin x)/x
    0.50     0.9589...
    0.10     0.9983...
    0.05     0.9996...
    ...      ...

(it is assumed that the values of $x$ are given in radian measure).

It is obvious that as $x$ approaches zero the magnitude $(\sin x)/x$ approaches
1, but of course we must still give a rigorous proof of this fact. The proof
may be obtained, for example, from the following inequality, which is
valid for all nonzero angles in the first quadrant:

$$\sin x < x < \tan x.$$

If we divide both sides of this inequality by $\sin x$, we obtain

$$1 < \frac{x}{\sin x} < \frac{1}{\cos x},$$

from which follows

$$\cos x < \frac{\sin x}{x} < 1.$$

But as $x$ decreases to zero $\cos x$ approaches 1, so that the magnitude
$(\sin x)/x$, being contained in the interval between $\cos x$ and 1, also
approaches 1, that is

$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

We shall have occasion below to make use of this fact.

Our equation has been proved for the case that $x$ approaches zero
through positive values. But by changing the proof in an obvious way,
it is possible to obtain the same result when $x$ approaches zero through
negative values.
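The values of $(\sin x)/x$ and the enclosing bounds $\cos x$ and 1 can be reproduced with a few lines of Python. This is a sketch for illustration only; the additional sample points 0.01 and -0.01 are not in the table of the text.

```python
import math


def ratio(x):
    # (sin x)/x for x != 0, with x in radians
    return math.sin(x) / x


for x in (0.50, 0.10, 0.05, 0.01, -0.01):
    # For small nonzero x the ratio lies between cos x and 1.
    print(f"{x:6.2f}  cos x = {math.cos(x):.6f}  (sin x)/x = {ratio(x):.6f}")
```

The last row uses a negative value of $x$; since both $\sin x$ and $x$ change sign, the ratio is the same as for the corresponding positive value.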

Let us now discuss for a moment the following question. A variable
magnitude may or may not have a limit and the question arises whether
it is possible to give a criterion for determining the existence of a limit
for a variable. We will confine ourselves to an important and sufficiently
general case, for which such a criterion can be given. Let us suppose that
the variable magnitude $x_n$ increases or at least does not decrease; that is,
it satisfies the inequalities

$$x_1 \le x_2 \le x_3 \le \cdots,$$

and let us also suppose we have determined that none of its values exceeds
a certain fixed number $M$; that is, $x_n \le M$ ($n = 1, 2, \ldots$). If we mark
the values of $x_n$ and the number $M$ on the x-axis, we see that the variable
point $x_n$ moves along the axis to the right but constantly remains to the
left of the point $M$. It is rather obvious that the variable point $x_n$ must
inevitably approach a certain limit point $a$, situated to the left of $M$ or at
most coinciding with $M$.

So, in the case under consideration, the limit

$$\lim x_n = a$$

of our variable exists.

The above argument has an intuitive character but we may consider
it as a proof. In a course in modern analysis a complete proof of this
fact is given on the basis of the theory of real numbers.

As an example let us consider the variable

$$u_n = \left(1 + \frac{1}{n}\right)^n \quad (n = 1, 2, 3, \ldots).$$

The first few values are $u_1 = 2$, $u_2 = 2.25$, $u_3 \approx 2.37$, $u_4 \approx 2.44$, ...,
which are seen to increase. From the binomial theorem of Newton it is
possible to prove that this increase holds for arbitrary $n$. Moreover, it is
also easy to prove that for all $n$ the inequality $u_n < 3$ is valid. Consequently,
our variable must have a limit which is not greater than 3. We
shall see that this limit plays a very important role in mathematical
physics and in a certain sense is the most natural base for logarithms of
numbers.

It is customary to denote this limit by the letter $e$. It is equal to

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = 2.718281828459045\ldots.$$

A more detailed analysis shows that the number $e$ is not rational.*

It is also possible to show that the limit under consideration exists
and is equal to $e$ not only when $n \to +\infty$ but also when $n \to -\infty$. In
both cases $n$ may also take on noninteger values.
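The behavior of $u_n = (1 + 1/n)^n$ described above (increasing, bounded by 3, approaching $e$) can be observed numerically. The sketch below checks only the first ten values and one large index; the choice of these indices is arbitrary.

```python
def u(n):
    # u_n = (1 + 1/n)^n
    return (1 + 1 / n) ** n


values = [u(n) for n in range(1, 11)]
increasing = all(a < b for a, b in zip(values, values[1:]))
bounded = all(v < 3 for v in values)

print(values[:4])            # u_1, ..., u_4
print(increasing, bounded)   # monotone growth and the bound u_n < 3 on this sample
print(u(10**6))              # a value close to e = 2.71828...
```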

Let us mention an important application to physics of the concept
of a limit. It consists of the remarkable fact that only by using the concept
of a limit (passage to the limit) is it possible for us to give a complete
definition of many of the concrete magnitudes encountered in physics.

Let us also consider for the moment the following geometric example.
In elementary geometry the figures considered first are those bounded by
straight line segments. But later there arises the more difficult task of
finding the length of the circumference of a circle with given radius.

If we analyze the difficulties connected with the solution of this problem,
we find that they reduce to the following.

We must give an answer to the question, what is meant by the length
of the circumference; that is, we must give a precise definition of this
length. It is essential that the definition should be expressible in terms
of the lengths of straight-line segments and also that it should provide
us with the possibility of effectively calculating the length of the
circumference.

It is understood, of course, that the result of this calculation should
be in agreement with practical experience. For example, if we consider
a circumference consisting of an actual thread, then, if we cut the thread
and stretch it out, we must obtain a segment whose length, within the
limits of accuracy of measurement, coincides with our computed length.

As is known from elementary geometry, the solution of this problem
reduces to the following definition. The length of a circumference is
defined to be the limit approached by the perimeter of a regular polygon†
inscribed in it as the number of sides of the polygon increases beyond
all bounds. Thus the solution of the problem is based essentially on the
concept of a limit.

The length of an arbitrary smooth curve is defined in the same way. 
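The definition of the circumference as a limit of perimeters can be illustrated numerically. The sketch below assumes the elementary formula $2r \sin(\pi/n)$ for the side of a regular $n$-gon inscribed in a circle of radius $r$; the constant `math.pi` is used only as a computational convenience, not as part of the definition.

```python
import math


def inscribed_perimeter(n, r=1.0):
    # Each side of a regular n-gon inscribed in a circle of radius r
    # is a chord of length 2 * r * sin(pi / n).
    return n * 2 * r * math.sin(math.pi / n)


for n in (6, 12, 96, 10_000):
    # The perimeters increase with n and approach 2 * pi * r from below.
    print(n, inscribed_perimeter(n))
```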

* In this connection we should remark that addition, subtraction, multiplication, and
division (excluding division by zero) of rational numbers, that is, numbers of the form
$p/q$ where $p$ and $q$ are integers, leads to rational numbers. But this is not necessarily
the case for the operation of taking a limit. The limit of a sequence of rational numbers
may be an irrational number.

† It is not important that the polygon should be regular. The only essential feature
is that the greatest side of the variable inscribed polygon should approach zero.

In the paragraphs just following, we will meet with a number of examples
of geometric and physical magnitudes that can be defined only with the
concept of a limit.

The concepts of limit and infinitesimal were given a definitive formulation
at the beginning of the last century. The definitions introduced
here are connected with the name of Cauchy, before whose time
mathematicians operated with concepts that were less clear. The present-day
concepts of a limit, of an infinitesimal as a variable magnitude, and of a
real number, resulted from the development of mathematical analysis
and were at the same time the means of stating and clarifying its many
achievements.

§4. Continuous Functions 

Continuous functions form the basic class of functions for the operations
of mathematical analysis. The general idea of a continuous function may
be obtained from the fact that its graph is continuous; that is, its curve
may be drawn without lifting the pencil from the paper.

A continuous function gives the mathematical expression of a situation
often encountered in practical life, namely that to a small increase in an
independent variable there corresponds a small increase in the dependent
variable, or function. Excellent examples of a continuous function are
given by the various rules governing the motion of bodies $s = f(t)$,
expressing the dependence of the distance $s$ on the time $t$. Since the time
and the distance are continuous, a law of motion of the body $s = f(t)$
sets up between them a definite continuous relation, characterized by
the fact that to a small increase in the time corresponds a small increase
in the distance.

Mankind arrived at the abstraction of continuity by observing the
surrounding so-called dense media, namely solids, liquids, and gases;
for example, metals, water, and air. In actual fact, as is well known now,
every physical medium represents the accumulation of a large number of
separate particles in motion. But these particles and the distances between
them are so small in comparison with the dimensions of the media in
which the phenomena of macroscopic physics take place that many of these
phenomena may be studied with sufficient accuracy if we consider the
medium as being approximately without interstices, that is, as continuously
distributed over the occupied space. It is on such an assumption that
many of the physical sciences are based, for example, hydrodynamics,
aerodynamics, and the theory of elasticity. The mathematical concept
of continuity naturally plays a large role in these sciences, and in many
others as well.

Let us consider an arbitrary function $y = f(x)$ and some specific value
of the independent variable $x_0$. If our function reflects a continuous
process, then to values $x$ which differ only slightly from $x_0$ will correspond
values of the function $f(x)$ differing only slightly from the value $f(x_0)$
at the point $x_0$. Thus if the increment $x - x_0$ of the independent variable
is small, then the corresponding increment $f(x) - f(x_0)$ of the function
will also be small. In other words if the increment of the independent
variable $x - x_0$ approaches zero, then the increment $f(x) - f(x_0)$ of the
function must also approach zero, a fact which may be expressed in the
following way:

$$\lim_{x - x_0 \to 0} [f(x) - f(x_0)] = 0. \qquad (8)$$

This relation constitutes the mathematical definition of continuity of the
function at the point $x_0$; namely, the function $f(x)$ is said to be continuous
at the point $x_0$, if equality (8) holds.
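Equality (8) can be explored numerically by computing the increment of a function for smaller and smaller increments of the argument. In the sketch below the two functions are chosen purely for illustration: `square` is continuous everywhere, while `step` is a hypothetical jump function with a break at zero.

```python
def increment(f, x0, dx):
    # f(x) - f(x0) with x = x0 + dx
    return f(x0 + dx) - f(x0)


def square(x):
    return x * x


def step(x):
    # Hypothetical jump function: 0 for x < 0 and 1 for x >= 0.
    return 0.0 if x < 0 else 1.0


for dx in (0.1, 0.001, -0.001):
    # The increment of square shrinks with dx; the increment of step
    # stays equal to -1 when zero is approached from the left.
    print(dx, increment(square, 1.0, dx), increment(step, 0.0, dx))
```

Such a table can suggest continuity or reveal a jump, but the definition itself requires the increment to approach zero for every way in which $x - x_0$ approaches zero.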

Finally, we give the following definition. A function is said to be
continuous in a given interval, if it is continuous at every point $x_0$ of this
interval; that is, if at every such point equality (8) is fulfilled.

Thus, in order to introduce a mathematical definition of the property
of a function reflected in the fact that its graph is continuous (in the
everyday sense of this word), it was necessary first to define local continuity
(continuity at the point $x_0$) and then on this basis to define continuity
of the function in the whole interval.

This definition, first introduced at the beginning of the last century by
Cauchy, is now generally adopted in contemporary mathematical analysis.
The test of many concrete examples has shown that it corresponds very
well to the practical notion we have of a continuous function, for instance,
as represented by its continuous graph.

As examples of continuous functions, the reader may consider the
elementary functions well known to him from school mathematics: $x^n$,
$\sin x$, $\cos x$, $a^x$, $\log x$, $\arcsin x$, $\arccos x$. All these functions are continuous
in the intervals for which they are defined.

If continuous functions are added, subtracted, multiplied, or divided
(except for division by zero), the result is also a continuous function.
But in the case of division the continuity is usually destroyed for those
values $x_0$ for which the function in the denominator vanishes. The result
of the division in that case is a function which is discontinuous at the point
$x_0$.

The function $y = 1/x$ may serve as an example of a function which
is discontinuous at the point $x = 0$. Other discontinuous functions are
represented by the graphs in figure 9.

[Fig. 9. Graphs of discontinuous functions; panels 9d and 9e.]

We recommend that the reader examine these graphs carefully. He will
notice that the breaks in the functions are of different kinds: In some cases
a limit of $f(x)$ exists as $x$ approaches the point $x_0$ where the function suffers
a discontinuity, but this limit is different from $f(x_0)$. In other cases, as in
figure 9c, the limit simply does not exist. It may also happen that as $x$
approaches $x_0$ from one side $f(x) - f(x_0) \to 0$, but as $x \to x_0$ from the
other side, $f(x) - f(x_0)$ does not approach zero. In this case, of course,
the function has a discontinuity, but we may say that at such a point it is
“continuous from one side.” All these cases are represented in the graphs
of figure 9.

As an exercise we recommend to the reader to consider the question,
what value must be given to the functions

$$\frac{\sin x}{x}, \quad \frac{1 - \cos x}{x^2}, \quad \frac{x^3 - 1}{x - 1}, \quad \frac{\tan x}{x}$$

at those points where they are not defined (that is, at the points where the
denominator is equal to zero), in order that they may be continuous at
these points. Also, is it possible to find such numbers for the functions


$$\tan x, \quad \frac{1}{x - 1}, \quad \frac{x - 2}{x^2 - 4}\,?$$


These discontinuous functions in mathematics represent the numerous
jumplike processes to be met with in nature. In the case of a sudden blow,
for example, the value of the velocity of a body changes in such a jumplike
fashion. Many qualitative transitions take place with such jumps.
In §2 we introduced the function $Q = f(t)$, expressing the way in which
the quantity of heat in a given quantity of water (or ice) depends on the
temperature. In the neighborhood of the melting point of ice the quantity
of heat $Q = f(t)$ changes in a jumplike fashion with changing $t$.

Functions with isolated discontinuities are encountered quite often
in analysis, along with the continuous functions. But as an example of a
more complicated function, where the number of discontinuities is infinite,
let us consider the so-called Riemann function, which is equal to
zero at all irrational points and equal to $1/q$ at rational points of the form
$x = p/q$ (where $p/q$ is a fraction in its lowest terms). This function is
discontinuous at all rational points and continuous at irrational points.
By altering it slightly we may easily obtain an example of a function which
is discontinuous at all points.* Let us remark by the way that even for
such complicated functions modern analysis has discovered many interesting
laws, which are investigated in one of the independent branches
of analysis, the theory of functions of a real variable. This theory has
developed with extraordinary rapidity during the past 50 years.

* It is sufficient to set the function equal to unity at the irrational points.
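The rational part of the Riemann function is easy to model with exact fractions. The sketch below is an illustration under a stated limitation: it accepts only rational arguments given as `Fraction` objects, since irrational points (where the function equals zero) cannot be represented exactly on a computer. The function name `riemann` is chosen here for convenience.

```python
from fractions import Fraction


def riemann(x):
    # Fraction keeps p/q in lowest terms, so the value at x = p/q is 1/q.
    # Irrational arguments, where the function is 0, are outside this sketch.
    return Fraction(1, x.denominator)


print(riemann(Fraction(2, 4)))  # 2/4 is first reduced to lowest terms

# Rational approximations of the irrational number sqrt(2): as the
# denominators grow, the values of the function approach 0.
for approx in (Fraction(7, 5), Fraction(141, 100), Fraction(14142, 10000)):
    print(approx, riemann(approx))
```

The shrinking values along such approximations are consistent with the continuity of the function at irrational points, where its value is zero.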

§5. Derivative

The next fundamental concept of analysis is the concept of derivative.
Let us consider two problems from which it arose historically.

Velocity. At the beginning of the present chapter we defined the velocity
of a freely falling body. To do so we made use of a passage to the limit
from the average velocity over short distances to the velocity at the given
point and the given time. The same procedure may be used to define the
instantaneous velocity for an arbitrary nonuniform motion. In fact, let
the function

$$s = f(t) \qquad (9)$$

express the dependence of the distance $s$ covered by the material point
in the time $t$. To find the velocity at the moment $t = t_0$, let us consider
the interval of time from $t_0$ to $t_0 + h$ ($h \neq 0$). During this time the point
will cover the distance

$$\Delta s = f(t_0 + h) - f(t_0).$$

The average velocity $v_{av}$ over this part of the path will depend on $h$,

$$v_{av} = \frac{\Delta s}{h} = \frac{1}{h}\,[f(t_0 + h) - f(t_0)],$$

and will represent the actual velocity at the point $t_0$ with greater and
greater accuracy as $h$ becomes smaller. It follows that the true velocity
at the time $t_0$ is equal to the limit

$$v = \lim_{h \to 0} \frac{f(t_0 + h) - f(t_0)}{h}$$

of the ratio of the increase in the distance to the increase in the time,
as the latter approaches zero without ever being actually equal to zero.
In order to calculate the velocity for different forms of motion, we must
discover how to find this limit for various functions $f(t)$.
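The passage from average to instantaneous velocity can be followed numerically for a freely falling body. The sketch below assumes the law $s = gt^2/2$ with $g = 9.81$ m/s² (an approximate value chosen for the illustration) and the moment $t_0 = 2$.

```python
G = 9.81  # assumed gravitational acceleration in m/s^2


def s(t):
    # Distance covered by a freely falling body after time t.
    return G * t * t / 2


def average_velocity(t0, h):
    # Ratio of the increase in distance to the increase in time, h != 0.
    return (s(t0 + h) - s(t0)) / h


t0 = 2.0
for h in (1.0, 0.1, 0.001, -0.001):
    # Algebraically the ratio equals G * (t0 + h / 2), which approaches G * t0.
    print(h, average_velocity(t0, h))
```

Both positive and negative values of $h$ are admitted, in agreement with the condition $h \neq 0$ above.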

Tangent. We are led to investigate a precisely analogous limit by
another problem, this time a geometric one, namely the problem of
drawing a tangent to an arbitrary plane curve.

Let the curve $C$ be the graph of a function $y = f(x)$, and let $A$ be the
point on the curve $C$ with abscissa $x_0$ (figure 10). Which straight line shall
we call the tangent to $C$ at the point $A$? In elementary geometry this
question does not arise. The only curve studied there, namely the
circumference of a circle, allows us to define the tangent as a straight line which
has only one point in common with the curve. But for other curves such
a definition will clearly not correspond to our intuitive picture of “tangency.”
Thus, of the two straight lines $L$ and $M$ in figure 11, the first is
obviously not tangent to the curve drawn there (a sinusoidal curve), although
it has only one point in common with it; while the second straight line
has many points in common with the curve, and yet it is tangent to the
curve at each of these points.

To define the tangent, let us consider on the curve $C$ (figure 10) another
point $A'$, distinct from $A$, with abscissa $x_0 + h$. Let us draw the secant
$AA'$ and denote the angle which it forms with the x-axis by $\beta$. We now
allow the point $A'$ to approach $A$ along the curve $C$. If the secant $AA'$
correspondingly approaches a limiting position, then the straight line $T$
which has this limiting position is called the tangent at the point $A$.
Evidently the angle $\alpha$ formed by the straight line $T$ with the x-axis must
be equal to the limiting value of the variable angle $\beta$.

The value of $\tan \beta$ is easily determined from the triangle $ABA'$ (figure
10):

$$\tan \beta = \frac{BA'}{AB} = \frac{f(x_0 + h) - f(x_0)}{h}.$$

For the limiting position we must have

$$\tan \alpha = \lim_{A' \to A} \tan \beta = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h},$$

that is, the trigonometric tangent of the angle of inclination of the tangent
line is equal to the limit of the ratio of the increase in the function $f(x)$
at the point $x_0$ to the corresponding increase in the independent variable,
as the latter approaches zero without ever being actually equal to zero.

Fig. 10.

Fig. 11.
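The approach of the secant to the tangent can be watched numerically by computing the angle of the secant for decreasing $h$. In the sketch below the curve $y = x^2$ and the point $x_0 = 1$ are chosen only as an example; for this curve the secant slope is $2 + h$, so the limiting angle is $\arctan 2$.

```python
import math


def secant_angle(f, x0, h):
    # Angle beta (in radians) between the x-axis and the secant through
    # (x0, f(x0)) and (x0 + h, f(x0 + h)); tan(beta) is the difference ratio.
    return math.atan((f(x0 + h) - f(x0)) / h)


def parabola(x):
    return x * x


x0 = 1.0
for h in (0.5, 0.05, 0.0005):
    print(h, math.degrees(secant_angle(parabola, x0, h)))

print(math.degrees(math.atan(2.0)))  # inclination of the tangent at x0 = 1
```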

Let us give still another example leading to the calculation of an analogous
limit. Let us suppose that a variable electric current is flowing through
a conductor. Let us assume that we know the function $Q = f(t)$ expressing
the quantity of electricity that has passed through a fixed cross section
of the conductor up to time $t$. In the period from $t_0$ to $t_0 + h$, there will
flow through this cross section a quantity of electricity $\Delta Q$ equal to
$f(t_0 + h) - f(t_0)$. The average value of the current will therefore be
equal to

$$I_{av} = \frac{\Delta Q}{h} = \frac{f(t_0 + h) - f(t_0)}{h}.$$

The limit of this ratio as $h \to 0$ will give us the value of the current at
the time $t_0$:

$$I = \lim_{h \to 0} \frac{\Delta Q}{h} = \lim_{h \to 0} \frac{f(t_0 + h) - f(t_0)}{h}.$$

All the three problems discussed, in spite of the fact that they refer to
different branches of science, namely mechanics, geometry, and the
theory of electricity, have led to one and the same mathematical operation
to be performed on a given function, namely to find the limit of the ratio
of the increase of the function to the corresponding increase $h$ of the
independent variable as $h \to 0$. The number of such widely different
problems could be increased at will, and their solution would lead to the
same operation. To it we are led, for example, by the question of the rate
of a chemical reaction, or of the density of a nonhomogeneous mass and
so forth. In view of the exceptional role played by this operation on
functions, it has received a special name, differentiation, and the result
of the operation is called the derivative of the function.

Thus, the derivative of the function $y = f(x)$, or more precisely, the
value of the derivative at the given point $x$, is the limit* approached by the
ratio of the increase $f(x + h) - f(x)$ of the function to the increase $h$
of the independent variable, as the latter approaches zero. We often write
$h = \Delta x$, and $f(x + \Delta x) - f(x) = \Delta y$, in which case the definition of the
derivative is written in the concise form:

$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

* It is understood that we are speaking here of the case where the limit in question
actually exists. If this limit does not exist, then we say that at the point $x$ the function
does not have a derivative.

The value of the derivative obviously depends on the point $x$ at which
it is found. Thus the derivative of a function $y = f(x)$ is itself a function
of $x$. It is customary to denote the derivative thus:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

Certain other notations are also customary for the derivative:

$$\frac{df(x)}{dx}, \quad \text{or} \quad \frac{dy}{dx}, \quad \text{or} \quad y', \quad \text{or} \quad y'_x.$$

We should also remark that the notation $\dfrac{dy}{dx}$ looks like a fraction,
although it is read as a single symbol for the derivative. In the following
sections the numerator and the denominator of this “fraction” will take
on independent meaning, in such a way that their ratio will coincide with
the derivative, so that this manner of writing is completely justified.

The results of these examples may now be formulated as follows.
The velocity of a point for which the distance $s$ is a given function of
the time, $s = f(t)$, is equal to the derivative of this function:

$$v = s' = f'(t).$$

More concisely, the velocity is the derivative of the distance with respect
to time.

The trigonometric tangent of the angle of inclination of the tangent
line to the curve $y = f(x)$ at the point with abscissa $x$ is equal to the
derivative of the function $f(x)$ at this point:

$$\tan \alpha = y' = f'(x).$$

The strength of the current $I$ at the time $t$, if $Q = f(t)$ is the quantity
of electricity which up to time $t$ has passed through a cross section of the
conductor, is equal to the derivative

$$I = Q' = f'(t).$$


Let us make the following remark. The velocity of a nonuniform motion
at a given time is a purely physical concept, arising from practical
experience. Mankind arrived at it as the result of numerous observations
on different concrete motions. The study of nonuniform motion of a
body on different parts of its path, the comparison of different motions
of this sort taking place simultaneously, and in particular the study of the
phenomena of collisions of bodies, all represented an accumulation of
practical experience that led to the setting up of the physical concept
of the velocity of a nonuniform motion at a given time. But the exact
definition of velocity necessarily depended upon the method of defining
its numerical value, and to define this value was possible only with the
concept of the derivative.

In mechanics the velocity of a body moving according to the rule
$s = f(t)$ at the time $t$ is defined as the derivative of the function $f(t)$ for this
value of $t$.

The discussion at the beginning of the present section has shown,
on the one hand, the advantages of introducing the operation of finding
the derivative, and on the other has given a reasonable justification for
the above formulated definition of the velocity at any given moment.

Thus, when we raised the question of finding the velocity of a point
in nonuniform motion we had, properly speaking, only an empirical
notion of its value but no exact definition. But now, as a result of our
analysis, we have reached an exact definition of the value of the velocity
at a given moment, namely the derivative of the distance with respect
to the time. This result is extremely important from a practical point of
view, since our empirical knowledge of the velocity has been greatly
enriched by the fact that we can now make an exact numerical calculation.

What has just been said refers equally well, of course, to the strength
of a current and to many other concepts expressing the rate of some
process, physical, chemical, and so forth.

This situation may serve as an example for numerous others of a
similar nature, where practical experience has led to the formation of a
concept relating to the external world (velocity, work, density, area,
and so forth) and then mathematics has enabled us to define this concept
precisely, whereupon we can make use of the concept in practical
calculations.

We have already noted at the beginning of the chapter that the concept
of a derivative arose chiefly as the result of many centuries of effort
directed towards the solving of two problems: drawing a tangent to a
curve and finding the velocity of a nonuniform motion. These problems,
and also the calculation of areas discussed later, interested mathematicians
in ancient times. But until the 16th century the statement and the method
of solution for each problem of this sort bore an extremely special character.
The accumulation of all this extensive material was reduced to a
theoretically complete system in the 17th century in the work of Newton
and Leibnitz. An important contribution to the foundations of present-day
analysis was also made by Euler.

But it must be said that Newton and Leibnitz and their contemporaries
provided very little logical basis for their great mathematical discoveries;
in their methods of reasoning and in the concepts with which they operated
there was much that is unclear from our point of view. Even at that time
the mathematicians themselves were quite conscious of this, as is shown
by the embittered discussions to be found in their correspondence with
one another. However, these mathematicians of the 17th and 18th centuries
carried on their purely mathematical activities in very close association
with the research of other investigators, in the various branches of natural
science (physics, mechanics, chemistry, technology). The statement of a
mathematical problem usually arose from practical needs or from a wish
to understand some phenomenon of nature, and as soon as the problem
was solved, the solution was submitted in one way or another to a practical
test. Consequently, in spite of a certain lack of logical basis, mathematics
was able to advance in extremely useful directions.


Examples for the calculation of derivatives. The definition of the
derivative as the limit

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

allows us to calculate the derivative of any given concrete function.

Of course, it must be admitted that cases are possible where the function
at one point or another or even at many points simply does not have a
derivative; in other words, the ratio

$$\frac{f(x + h) - f(x)}{h}$$

as $h \to 0$ does not approach a finite limit. This case obviously occurs
at every point of discontinuity of the function $f(x)$, since here the ratio

$$\frac{f(x + h) - f(x)}{h} \qquad (10)$$

has a numerator which does not approach zero while the denominator
approaches zero. The derivative may also fail to exist at a point
where the function is continuous. A simple example is given by any
point where the graph of the function forms an angle (figure 12). At such
a point the curve of the graph has no definite tangent, and consequently
the function has no derivative. Often at such points the expression (10)
approaches different values, depending on whether $h$ approaches zero
from the right or from the left, so that if $h$ approaches zero in an arbitrary
manner, the ratio (10) simply has no limit.

Fig. 12.

An example of a more complicated function without a derivative is given by

$$y = \begin{cases} x \sin \dfrac{1}{x} & \text{for } x \neq 0, \\ 0 & \text{for } x = 0. \end{cases}$$

The graph of this function is drawn in figure 13. At the point $x = 0$
it has no derivative because, as is evident from the graph, the secant $OA$
does not approach any definite position even when $A \to O$ from one side.
In fact, the secant $OA$ oscillates endlessly back and forth between the
straight line $OM$ and the straight line $OL$. The corresponding ratio (10)
in this case has no limit, even if $h$ preserves the same sign as it approaches
zero.
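Both ways in which a continuous function may fail to have a derivative can be observed numerically. In the sketch below, the absolute value $|x|$ is taken as a stand-in for a graph with an angle (an assumption made for the illustration), and `g` is the function $x \sin(1/x)$ completed by the value 0 at zero. The helper names are chosen here for convenience.

```python
import math


def quotient(f, x, h):
    # The ratio (f(x + h) - f(x)) / h for h != 0.
    return (f(x + h) - f(x)) / h


def g(x):
    # x * sin(1/x) for x != 0, and 0 for x = 0.
    return x * math.sin(1 / x) if x != 0 else 0.0


# Angle at 0 on the graph of |x|: the ratio differs from the right and the left.
print(quotient(abs, 0.0, 1e-6), quotient(abs, 0.0, -1e-6))

# For g the ratio at 0 equals sin(1/h). Along positive values of h chosen so
# that 1/h = pi/2 + k*pi, it alternates between values near +1 and -1.
for k in (10, 11, 1000, 1001):
    h = 1 / (math.pi / 2 + k * math.pi)
    print(h, quotient(g, 0.0, h))
```

The second loop uses only positive $h$, which mirrors the remark that the ratio has no limit even if $h$ preserves the same sign.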



Fig. 13.

Let us remark finally that it is possible to define, in a purely analytic
way by means of a formula, a continuous function which does not have
a derivative at any point. An example of such a function was first given
by the outstanding German mathematician of the last century, Weierstrass.

Consequently the class of differentiable functions is considerably
narrower than that of continuous functions.

Let us pass now to the actual calculation of the derivatives of the
simplest functions.

1. $y = c$, where $c$ is a constant. A constant may be considered as a
special case of a function that remains equal to the same number for
arbitrary $x$. Its graph is a straight line parallel to the x-axis at a distance
equal to $c$. This straight line forms with the x-axis an angle $\alpha = 0$, and
obviously the derivative of a constant is identically equal to zero:
$y' = (c)' = 0$. From the point of view of mechanics, this equation
means that the velocity of a fixed point is equal to zero.

2. $y = x^2$.

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h} = 2x + h.$$

As $h \to 0$ we obtain* in the limit $2x$; consequently

$$y' = (x^2)' = 2x.$$

3. $y = x^n$ ($n$ a positive integer).

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^n - x^n}{h}
= \frac{x^n + n x^{n-1} h + \frac{n(n-1)}{2} x^{n-2} h^2 + \cdots + h^n - x^n}{h}
= n x^{n-1} + \frac{n(n-1)}{2} x^{n-2} h + \cdots + h^{n-1}.$$

Every addend on the right side, beginning with the second, approaches
zero as $h \to 0$; consequently

$$y' = (x^n)' = n x^{n-1}.$$

This formula remains true for arbitrary $n$, positive or negative, fractional
or even irrational, although the proof must then be different. We will
make use of this fact without proving it. Thus for example

* We always assume here that $h \neq 0$.

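$$(\sqrt{x})' = (x^{1/2})' = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}, \quad (x > 0);$$

$$(\sqrt[3]{x})' = (x^{1/3})' = \frac{1}{3} x^{-2/3} = \frac{1}{3\sqrt[3]{x^2}}, \quad (x \neq 0);$$

$$\left(\frac{1}{x}\right)' = (x^{-1})' = -1 \cdot x^{-2} = -\frac{1}{x^2}, \quad (x \neq 0);$$

$$(x^n)' = n x^{n-1}, \quad (x > 0).$$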
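The formula $(x^n)' = n x^{n-1}$ can be compared with the difference ratio for several kinds of exponent. The following sketch is only a numerical check at the single point $x = 2$, with a small fixed $h$ chosen arbitrarily; it does not replace the proofs mentioned in the text.

```python
def quotient(f, x, h=1e-6):
    # The ratio (f(x + h) - f(x)) / h with a small fixed h.
    return (f(x + h) - f(x)) / h


x = 2.0
for n in (2, 5, 0.5, -1, 1 / 3):
    numeric = quotient(lambda t: t ** n, x)   # difference ratio
    formula = n * x ** (n - 1)                # value given by the power rule
    print(n, numeric, formula)
```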

4. $y = \sin x$.

$$\frac{\sin (x + h) - \sin x}{h} = \frac{2 \sin \frac{h}{2} \cos \left(x + \frac{h}{2}\right)}{h} = \frac{\sin \frac{h}{2}}{\frac{h}{2}} \cos \left(x + \frac{h}{2}\right).$$

As explained earlier, the first fraction approaches unity as $h \to 0$, and
$\cos (x + h/2)$ obviously approaches $\cos x$. Thus the derivative of the sine
is equal to the cosine:

$$y' = (\sin x)' = \cos x.$$

We suggest to the reader that by the same sort of argument he prove that

$$(\cos x)' = -\sin x.$$
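The transformation of the difference ratio for the sine into a product can be checked numerically. In the sketch below the point $x = 0.7$ is an arbitrary choice; the two printed ratios agree up to rounding, and both approach $\cos x$ as $h$ decreases.

```python
import math


def quotient(f, x, h):
    # The ratio (f(x + h) - f(x)) / h for h != 0.
    return (f(x + h) - f(x)) / h


x = 0.7
for h in (0.1, 0.001, 1e-6):
    # Product form: (sin(h/2) / (h/2)) * cos(x + h/2)
    product = math.sin(h / 2) / (h / 2) * math.cos(x + h / 2)
    print(h, quotient(math.sin, x, h), product, math.cos(x))
```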

5. Earlier (Chapter II, §3) we have already noted the existence of the
limit

$$\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e = 2.71828\ldots.$$

We also remarked that for the calculation of this limit no essential role
is played by the fact that $n$ took on only positive integral values. It is
important only that the infinitesimal $1/n$, which is being added to unity,
and the exponent $n$, which is increasing beyond all bounds, should be
reciprocal to each other.

Making use of this assertion, we can easily find the dérivative of the 
logarithm y = log„ x 


logq (X + h) - log„ x 

A 




îM 1 + 


r 


X 
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The continuity of the logarithm allows us to replace the quantity under 
the log sign by its limit, which is equal to e\ thus 


lim 

h-0 


(■+4f 


e 


(in this case the rôle of n -* co is played by the increasing quantity x/h). 
As a resuit, we obtain the rule for differentiating a logarithm 


(log a Ar)' = -log a <?. 


This rule becomes particularly simple if as the base of our logarithms we 
choose the number e. Logarithms taken to this base are called natural 
logarithms and are denoted by ln x. We may write 

(log, x)' = l - 

or again 
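Both steps of this argument lend themselves to a quick numerical check. The sketch below is an illustration only: the sample point x = 3, the base a = 10, and the helper `log_quotient` are assumptions of ours.

```python
import math

x = 3.0

# (1 + h/x)**(x/h) should approach e as h -> 0; here x/h plays the role of n.
approximations = [(1 + h / x) ** (x / h) for h in (1e-1, 1e-3, 1e-5)]

def log_quotient(a, x, h):
    """Difference quotient of the logarithm to base a at the point x."""
    return (math.log(x + h, a) - math.log(x, a)) / h

# Rule from the text: (log_a x)' = (1/x) * log_a e.
a = 10.0
predicted = (1 / x) * math.log(math.e, a)
measured = log_quotient(a, x, 1e-6)

print(approximations, math.e)
print(predicted, measured)
```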


§6. Rules for Differentiation

From the examples given earlier it may appear that the calculation of
the derivative of every new function demands the invention of new methods.
This is not the case. The development of analysis was made possible to no
small extent by the discovery of a simple unified method for finding the
derivative of an arbitrary "elementary" function (that is, a function which
may be expressed by a formula consisting of a finite combination of the
fundamental algebraic operations, the trigonometric functions, the
operation of raising to a power, and the taking of logarithms). At the
basis of this method are the so-called rules of differentiation. They consist
of a number of theorems that allow us to reduce more complicated
problems to simpler ones.

We will explain here the rules of differentiation and will try to be very
brief in deducing them. If the reader wishes to form merely a general
idea of analysis, he may omit the present section, remembering only
that there exists a means of actually finding the derivative of any elementary
function. In this case it will be necessary, of course, for him to take
on faith some of the calculations in our later examples.


Derivative of a sum. Assume that y is given as a function of x by the
expression

$$y = \varphi(x) + \psi(x),$$

where $u = \varphi(x)$ and $v = \psi(x)$ are known functions of x. We assume
moreover that we can find the derivatives of the functions u and v. How
then are we to find the derivative of the function y? The answer is simple:

$$y' = (u + v)' = u' + v'. \tag{11}$$

In fact, let us give x an increment $\Delta x$; then u, v, and y will each receive
an increment $\Delta u$, $\Delta v$, and $\Delta y$, connected by the equation

$$\Delta y = \Delta u + \Delta v.$$

Thus*

$$\frac{\Delta y}{\Delta x} = \frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x},$$

and after the passage to the limit for $\Delta x \to 0$ we at once get formula
(11), if, of course, the functions u and v have derivatives.

Analogously we may derive the formula for differentiating the difference
of two functions

$$(u - v)' = u' - v'. \tag{12}$$

Derivative of a product. The rule for the differentiation of a product
is somewhat more complicated. The derivative of the product of two
functions, each of which has a derivative, exists, and is equal to the sum
of the product of the first function by the derivative of the second and
the product of the second by the derivative of the first; that is

$$(uv)' = uv' + vu'. \tag{13}$$

In fact, let us give x an increment $\Delta x$. Then the functions u, v and
y = uv will receive the increments $\Delta u$, $\Delta v$, $\Delta y$, satisfying the relation

$$\Delta y = (u + \Delta u)(v + \Delta v) - uv = u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v,$$

from which

$$\frac{\Delta y}{\Delta x} = u \frac{\Delta v}{\Delta x} + v \frac{\Delta u}{\Delta x} + \Delta u \frac{\Delta v}{\Delta x}.$$

After passage to the limit for $\Delta x \to 0$ the first two summands on the
right side produce the right side of formula (13) while the third summand
vanishes.† Consequently, in the limit we obtain the rule (13).

* Here $\Delta x$ is never equal to zero.

† The final summand here approaches zero for $\Delta x \to 0$, since $\Delta v / \Delta x$ approaches
a finite number, equal to the derivative v', which was assumed from the beginning to
exist, and $\Delta u \to 0$, since the function u, assumed to have a derivative, is continuous.

In the particular case v = c = const, we have

$$(cu)' = cu' + uc' = cu', \tag{14}$$

since the derivative of a constant is equal to zero.


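Derivative of a quotient. Let y = u/v, where u and v have a derivative
for a given x, with $v \neq 0$ for that value of x. Obviously

$$\Delta y = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v} = \frac{v\,\Delta u - u\,\Delta v}{(v + \Delta v)\,v},$$

from which

$$\frac{\Delta y}{\Delta x} = \frac{v \dfrac{\Delta u}{\Delta x} - u \dfrac{\Delta v}{\Delta x}}{(v + \Delta v)\,v} \to \frac{v u' - u v'}{v^2} \quad (\Delta x \to 0).$$

Here we have again made use of the fact that for a function v which
has a derivative we necessarily have $\Delta v \to 0$ when $\Delta x \to 0$. Thus

$$\left(\frac{u}{v}\right)' = \frac{v u' - u v'}{v^2}. \tag{15}$$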


Let us give some examples of the application of these rules:

$$(2x^3 - 5)' = 2(x^3)' - (5)' = 2 \cdot 3x^2 - 0 = 6x^2;$$

$$(x^2 \sin x)' = x^2 (\sin x)' + (x^2)' \sin x = x^2 \cos x + 2x \sin x;$$

$$(\tan x)' = \left(\frac{\sin x}{\cos x}\right)' = \frac{\cos x\,(\sin x)' - \sin x\,(\cos x)'}{\cos^2 x}
= \frac{\cos x \cdot \cos x - \sin x\,(-\sin x)}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x.$$

We recommend to the reader to prove for himself the formula

$$(\cot x)' = -\csc^2 x.$$
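The product rule (13) and the quotient rule (15) can be spot-checked against a numerical derivative. This is a sketch under our own assumptions: the central-difference helper `derivative`, the step 1e-6, and the sample point x = 0.7 are illustrative choices, with u = sin x and v = cos x as in the tangent example.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
u, du = math.sin, math.cos                    # u = sin x, u' = cos x
v, dv = math.cos, lambda t: -math.sin(t)      # v = cos x, v' = -sin x

product_rule = u(x) * dv(x) + v(x) * du(x)                   # formula (13)
quotient_rule = (v(x) * du(x) - u(x) * dv(x)) / v(x) ** 2    # formula (15)

product_numeric = derivative(lambda t: u(t) * v(t), x)
quotient_numeric = derivative(lambda t: u(t) / v(t), x)      # derivative of tan x

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric, 1 / math.cos(x) ** 2)
```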


Derivative of the inverse function. Let us consider a function y = f(x),
which is continuous and increasing (decreasing) on the interval [a, b].
By increasing (decreasing) we mean that to a greater value of x in the
interval [a, b] corresponds a greater (smaller) value of y (figure 14).

Let c = f(a) and d = f(b). In figure 14 it is evident that for each value
of y from the interval [c, d] (or [d, c], respectively) there corresponds
exactly one value of x from the interval [a, b] such that y = f(x). Thus
on the interval [c, d] (or [d, c]) we have a completely determined function
$x = \varphi(y)$, which is called the inverse function of y = f(x). In figure 14 it
is clear that the function $\varphi(y)$ is continuous, a fact which is proved in
modern analysis by strictly analytical methods. Now let $\Delta x$ and $\Delta y$
correspond respectively to the increments in x and y. It is evident that

$$\frac{\Delta y}{\Delta x} = \frac{1}{\Delta x / \Delta y}, \quad \text{if } \Delta y \neq 0.$$

In the limit this gives us a simple relation between derivatives of the direct
and inverse functions

$$y'_x = \frac{1}{x'_y}. \tag{16}$$

Let us make use of this relation to find the derivative of the function
$y = a^x$. The inverse function is $x = \log_a y$, which we are already able
to differentiate, and so we may write

$$(a^x)'_x = \frac{1}{(\log_a y)'_y} = \frac{1}{\frac{1}{y} \log_a e} = y \log_e a = a^x \ln a. \tag{17}$$

In particular $(e^x)' = e^x$.

As another example let us take y = arc sin x. The inverse function
is x = sin y. Thus

$$(\arcsin x)'_x = \frac{1}{(\sin y)'_y} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}.$$

Table of derivatives. Let us tabulate the derivatives of the simplest
elementary functions (Table 2).

These formulas have been calculated and explained earlier, with the
exception of the last two which the reader may, if he wishes, easily derive
for himself by using the rule for differentiation of an inverse function.
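The rule (16) for the inverse function can be tried out numerically on the two functions treated above, arc sin x and $a^x$. The sketch is illustrative only: the helper `derivative` and the sample values x = 0.3, a = 2, t = 1.5 are our own assumptions.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# y = arc sin x, with inverse x = sin y.
x = 0.3
y = math.asin(x)
via_inverse_rule = 1 / math.cos(y)          # rule (16): 1 / (sin y)'
closed_form = 1 / math.sqrt(1 - x**2)
numeric = derivative(math.asin, x)

# y = a**t, with inverse t = log_a y; formula (17) gives a**t * ln a.
a, t = 2.0, 1.5
exp_formula = a**t * math.log(a)
exp_numeric = derivative(lambda s: a**s, t)

print(via_inverse_rule, closed_form, numeric)
print(exp_formula, exp_numeric)
```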


Calculation of the derivative of a function of a function. It remains
to consider the last and most difficult rule for differentiation. The reader
in possession of this rule and of a set of tables may with perfect right
consider that he is able to differentiate any elementary function.

In order to apply the rule we are about to give, it is necessary to be
completely clear about how the function we wish to differentiate is
constructed; that is, which operations we must perform on the independent
variable x, and in which order, to produce the value of the
dependent variable y.

For example, to calculate the function

$$y = \sin x^2,$$

it is necessary first of all to raise x to the second power and then to take
the sine of the magnitude so obtained, a procedure which may be described
in the following way: $y = \sin u$, where $u = x^2$.

On the other hand, in order to calculate the function

$$y = \sin^2 x,$$

it is necessary first of all to find the sine of x, and then to raise the value
so found to the second power, a procedure which may be written thus:
$y = u^2$, where $u = \sin x$.

Here are some examples:

1. $y = (3x + 4)^3$; $y = u^3$, $u = 3x + 4$.

2. $y = \sqrt{1 - x^2}$; $y = u^{1/2}$, $u = 1 - x^2$.

3. $y = e^{kx}$; $y = e^u$, $u = kx$.

In more complicated cases we have a chain of simple relations, which
may have several links. For example,

4. $y = \cos^3 x^2$; $y = u^3$; $u = \cos v$; $v = x^2$.

If y is a function of the variable u

$$y = f(u), \tag{18}$$

and u in its turn is a function of the variable x

$$u = \varphi(x), \tag{19}$$

then y, being a function of u, is also a certain function of x, which we may
denote as follows

$$y = F(x) = f[\varphi(x)]. \tag{20}$$

By considering more complicated cases we may form, for example,
the function

$$y = \Phi(x) = f\{\varphi[\psi(x)]\},$$

which is equivalent to the equations

$$y = f(u), \quad u = \varphi(v), \quad v = \psi(x),$$

and we could form still longer chains.

We now show how to calculate the derivative of the function F(x)
defined by equation (20) if we know the derivative of f(u) with respect
to u and the derivative of $\varphi(x)$ with respect to x.

Let us give to x the increment $\Delta x$; then by (19) u will receive a certain
increment $\Delta u$ and by (18) y will receive an increment $\Delta y$. Thus we may
write

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x}.$$

Now let $\Delta x$ approach zero. Then $\Delta u / \Delta x \to u'_x$. Furthermore, from
the continuity of u, the increment $\Delta u \to 0$, and therefore $\Delta y / \Delta u \to y'_u$ (the
existence of the derivatives $y'_u$ and $u'_x$ was assumed).

Thus we have proved the important formula for the derivative of a
function of a function*

$$y'_x = y'_u\,u'_x. \tag{21}$$

Let us calculate, from formula (21) and the fundamental table of
derivatives given, the derivatives of the functions we have been considering:

1. $y = (3x + 4)^3 = u^3$, $y'_x = (u^3)'_u\,(3x + 4)'_x = 3u^2 \cdot 3 = 9(3x + 4)^2$.

2. $y = \sqrt{1 - x^2} = u^{1/2}$, $y'_x = (u^{1/2})'_u\,(1 - x^2)'_x = \tfrac{1}{2} u^{-1/2} (-2x) = -\dfrac{x}{\sqrt{1 - x^2}}$.

3. $y = e^{kx} = e^u$, $y'_x = (e^u)'_u \cdot u'_x = e^u \cdot k = k e^{kx}$.

If $y = f(u)$, $u = \varphi(v)$, $v = \psi(x)$, then

$$y'_x = y'_u \cdot u'_x = y'_u\,(u'_v \cdot v'_x) = y'_u \cdot u'_v \cdot v'_x.$$

It is clear how to generalize this formula for the case of an arbitrary
(finite) number of functions in the chain. For example,

4. $y = \cos^3 x^2$; $y'_x = (u^3)'_u \cdot (\cos v)'_v \cdot (x^2)'_x = 3u^2 (-\sin v) \cdot 2x = -6x \cos^2 x^2 \sin x^2$.

In our explanation of how to calculate the derivative of a function of a
function, we have introduced intermediate variables u, v, ... . But in fact,
after a little practice one may dispense with them, simply keeping in mind
the functions they denote.
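The chain in example 4 can be followed link by link in a few lines of code. This is only a sketch: the function names `chain` and `derivative` and the sample point x = 0.9 are our own, chosen for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def chain(x):
    """Derivative of y = cos(x**2)**3, built as y = u**3, u = cos v, v = x**2."""
    v = x**2
    u = math.cos(v)
    dy_du = 3 * u**2        # (u^3)' with respect to u
    du_dv = -math.sin(v)    # (cos v)' with respect to v
    dv_dx = 2 * x           # (x^2)' with respect to x
    return dy_du * du_dv * dv_dx

x = 0.9
by_chain_rule = chain(x)
closed_form = -6 * x * math.cos(x**2) ** 2 * math.sin(x**2)
numeric = derivative(lambda t: math.cos(t**2) ** 3, x)

print(by_chain_rule, closed_form, numeric)
```

Writing the three factors as separate variables mirrors the intermediate variables u and v; with practice they can be collapsed into a single expression.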

The elementary functions. To close the present section let us remark
that the functions whose derivatives were listed in tabular form (Table 2)
may be used to define the so-called elementary functions. These elementary
functions are defined as those functions that may be obtained from the
preceding simple functions by the four arithmetical operations and the
operation of taking a function of a function, each of these operations being
performed a finite number of times.

For example, the polynomial $x^3 - 2x^2 + 3x - 5$ is an elementary
function since it is obtained by arithmetic operations from a number of
functions of the form $x^k$. The function $\ln \sqrt{1 - x^2}$ is also elementary,
since it is obtained from the polynomial $u = 1 - x^2$ by the operation
$v = u^{1/2}$, and subsequently the operation ln v.

The rules for differentiation discussed earlier are sufficient to obtain the
derivative of any elementary function, as soon as we know the derivatives
of the simplest elementary functions.

* In deducing formula (21) we have tacitly assumed that, as $\Delta x$ approaches zero,
$\Delta u$ is never equal to zero. But the formula remains true even when this assumption
does not hold.

§7. Maximum and Minimum; Investigation of the Graphs of Functions

One of the simplest and most important applications of the derivative
is in the theory of maxima and minima. Let us suppose that on a certain
interval $a \le x \le b$ we are given a function y = f(x) which is not only
continuous but also has a derivative at every point. Our ability to calculate
the derivative enables us to form a clear picture of the graph of the
function. On an interval on which the derivative is always positive the
tangent to the graph will be directed upward. On such an interval the
function will increase; that is, to a greater value of x will correspond a
greater value of f(x). On the other hand, on an interval where the derivative
is always negative, the function will decrease; the graph will run downward.

Maximum and minimum. In figure 15 we have drawn the graph of a
function y = f(x) defined on the interval [a, b]. Of special interest are
the points of this graph whose abscissas are $x_0$, $x_1$, $x_2$, $x_3$.

At the point $x_0$ the function f(x) is said to have a local maximum; by
this we mean that at this point f(x) is greater than at neighboring points;
more precisely $f(x_0) \ge f(x)$ for every x in a certain interval around the
point $x_0$.

A local minimum is defined analogously.

For our function a local maximum occurs at the points $x_0$ and $x_3$,
and a local minimum at the point $x_1$.

At every maximum or minimum point, if it is inside the interval [a, b],
i.e., if it does not coincide with one of the end points a or b, the derivative
must be equal to zero.

This last statement, a very important one, follows immediately from the
definition of the derivative as the limit of the ratio $\Delta y / \Delta x$. In fact, if we
move a short distance from the maximum point, then $\Delta y \le 0$. Thus for
positive $\Delta x$ the ratio $\Delta y / \Delta x$ is nonpositive, and for negative $\Delta x$ the ratio
$\Delta y / \Delta x$ is nonnegative. The limit of this ratio, which exists by hypothesis,
can therefore be neither positive nor negative and there remains only the
possibility that it is zero. By inspection of the diagram it is seen that this
means that at maximum or minimum points (it is customary to leave
out the word "local," although it is understood) the tangent to the graph
is horizontal. In figure 15 we should remark that at the points $x_2$ and $x_4$
also the tangent is horizontal, just as it is at the points $x_0$, $x_1$, $x_3$, although
at these points the function has neither maximum nor minimum. In
general, there may be more points at which the derivative of the function
is equal to zero (stationary points) than there are maximum or minimum
points.

Determination of the greatest and least values of a function. In
numerous technical questions it is necessary to find the point x at
which a given function f(x) attains its greatest or its least value on a given
interval.

In case we are interested in the greatest value, we must find $x_0$ on the
interval [a, b] for which among all x on [a, b] the inequality $f(x_0) \ge f(x)$
is fulfilled.

But now the fundamental question arises, whether in general there
exists such a point. By the methods of modern analysis it is possible to
prove the following existence theorem: If the function f(x) is continuous
on a finite closed interval [a, b], then there exists at least one point on the
interval for which the function attains its maximum (minimum) value on the
interval [a, b].

From what has been said already, it follows that these maximum or
minimum points, when they lie inside the interval and the derivative exists
there, must be sought among the "stationary" points. This
fact is the basis of the following well-known method for finding maxima
and minima.

First we find the derivative of f(x) and then solve the equation obtained
by setting it equal to zero

$$f'(x) = 0.$$

If $x_1, x_2, \ldots, x_n$ are the roots of this equation, we then compare the
numbers $f(x_1), f(x_2), \ldots, f(x_n)$ with one another. Of course, it is necessary
to take into account that the maximum or minimum of the function may
be found not within the interval but at the end (as is the case with the
minimum in figure 15) or at a point where the function has no derivative
(as in figure 12). Thus to the points $x_1, x_2, \ldots, x_n$ we must add the ends
a and b of the interval and also those points, if they exist, at which there
is no derivative. It only remains to compare the values of the function
at all these points and to choose among them the greatest or the least.

With respect to the stated existence theorem, it is important to add
that this theorem ceases, in general, to hold in the case that the function
f(x) is continuous only on the interval (a, b); that is, on the set of points
x satisfying the inequalities a < x < b. We leave it to the reader to consider
the fact that the function 1/x has neither a maximum nor a minimum
on the interval (0, 1).

Let us look at some examples.

From a square piece of tin of side a it is required to make a rectangular
open box of maximum volume. If from the corners of the original square
we take away squares of side x (see §2, example 2) we get a box with the
volume

$$V = x(a - 2x)^2.$$

Our problem then becomes to find the value of x for which the function
V(x) attains its greatest value on the interval $0 \le x \le a/2$. In accordance
with the rule, we find the derivative and set it equal to zero

$$V'(x) = (a - 2x)^2 - 4x(a - 2x) = 0.$$

Solving this equation, we find the two roots

$$x_1 = \frac{a}{2}, \quad x_2 = \frac{a}{6}.$$

To these we adjoin the left end of the interval (the right end is identical
with $x_1$) and compare the values of the function at these points

$$V(0) = 0; \quad V\left(\frac{a}{2}\right) = 0; \quad V\left(\frac{a}{6}\right) = \frac{2}{27} a^3.$$

Thus the box will have the greatest volume, equal to $\frac{2}{27} a^3$, for the
height x = a/6.
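The comparison of candidate points in this example is easy to mechanize. The sketch below uses exact fractions and takes a = 1 purely for illustration; the names `volume`, `candidates`, and `best_x` are our own.

```python
from fractions import Fraction

def volume(x, a):
    """Volume of the open box cut from a square of side a, corner squares of side x."""
    return x * (a - 2 * x) ** 2

a = Fraction(1)                          # side of the tin square (illustrative value)
stationary = [a / 2, a / 6]              # roots of V'(x) = (a - 2x)(a - 6x) = 0
candidates = [Fraction(0)] + stationary  # left end plus the roots; a/2 is also the right end

best_x = max(candidates, key=lambda x: volume(x, a))
best_volume = volume(best_x, a)

print(best_x, best_volume)
```

The factorization $V'(x) = (a - 2x)(a - 6x)$ used in the comment follows from taking the common factor $(a - 2x)$ out of the expression for V'(x) in the text.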

As a second example, let us examine the problem of the lamp at the
skating rink (see §2, example 3). At what height h should we place the
lamp in order that the edge of the rink may receive the greatest illumination?

By formula (3) of §2, our problem reduces to determining the value of h
for which $T = A \sin \alpha / (h^2 + r^2)$ takes on its greatest value. Instead of h it is
more convenient here to find the angle $\alpha$ (figure 3, Chapter I). We have

$$h = r \tan \alpha,$$

so that

$$T = \frac{A \sin \alpha}{r^2 (1 + \tan^2 \alpha)} = \frac{A}{r^2} \sin \alpha \cos^2 \alpha.$$

Then it is required to find the maximum of the function $T(\alpha)$ among
those values of $\alpha$ which satisfy the inequality $0 < \alpha < \pi/2$. To do this,
we find the derivative and set it equal to zero

$$T'(\alpha) = \frac{A}{r^2} (\cos^3 \alpha - 2 \sin^2 \alpha \cos \alpha) = 0.$$

This equation splits into two

$$\cos \alpha = 0, \quad \cos^2 \alpha - 2 \sin^2 \alpha = 0.$$

The first equation has the root $\alpha = \pi/2$, which coincides with the end
of the interval $(0, \pi/2)$. The second equation may be put in the form

$$\tan^2 \alpha = \tfrac{1}{2}.$$

But since $0 < \alpha < \pi/2$, we have the result $\alpha \approx 35°15'$. So this is the value
for which the function $T(\alpha)$ attains its maximum (at the ends of the interval,
T = 0). The desired height h is thus equal to

$$h = r \tan \alpha = \frac{r}{\sqrt{2}} \approx 0.7\,r.$$

For best illumination of the edge of the rink the lamp should be placed
at a height equal to about 0.7 times the radius.
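The stationary angle can be confirmed independently by a crude search over the interval. This is a sketch only: the constant factor $A/r^2$ is dropped since it does not affect the location of the maximum, and the grid of 100000 points and the name `illumination_shape` are our own choices.

```python
import math

def illumination_shape(alpha):
    """sin(alpha) * cos(alpha)**2, that is, T(alpha) up to the factor A / r**2."""
    return math.sin(alpha) * math.cos(alpha) ** 2

# Stationary point from tan(alpha)**2 = 1/2.
alpha_star = math.atan(math.sqrt(0.5))
height_ratio = math.tan(alpha_star)          # h / r

# Grid search over the open interval (0, pi/2) as an independent check.
steps = 100000
grid = [k * (math.pi / 2) / steps for k in range(1, steps)]
alpha_grid = max(grid, key=illumination_shape)

print(math.degrees(alpha_star), height_ratio, alpha_grid)
```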

But now let us suppose that the facilities at our disposal do not allow
us to raise the lamp to a height greater than a certain H. Then the angle
$\alpha$ may vary not from 0 to $\pi/2$ but only within the narrower limits
$0 < \alpha \le \arctan (H/r)$. For example, let r = 12 meters and H = 9 meters.
In this case, it is in fact possible to raise the lamp to the height $h = r/\sqrt{2}$,
which amounts to somewhat more than 8 meters, so that this is what we
ought to do. But if H is less than 8 meters (for example, if we have at
our disposal only a pole of length 6 meters), then it turns out that the
derivative of the function $T(\alpha)$ in the interval $[0, \arctan (H/r)]$ is nowhere
equal to zero. In this case the maximum is attained at the end of the
interval, and the lamp should be raised to the greatest possible height
H = 6 meters.

Up to now we have considered a function on a finite interval. If the
interval is infinite in length, then even a continuous function may fail
to attain its greatest or least value but may, for example, continue to
grow or to decrease as x approaches infinity.

Thus the functions y = kx + b (see figure 5, Chapter I), y = arc
tan x (figure 16a), y = ln x (figure 16b) nowhere attain either a
maximum or a minimum. The function $y = e^{-x^2}$ (figure 16c) attains
its maximum at the point x = 0, but nowhere attains a minimum. As for
the function $y = x/(1 + x^2)$ (figure 16d), it reaches its minimum at the
point x = -1 and its maximum at the point x = 1.

[Figures 16a and 16b: graphs of y = arc tan x and y = ln x.]




[Figures 16c and 16d: graphs of $y = e^{-x^2}$ and $y = x/(1 + x^2)$.]

In the case of an interval of infinite length the investigation may be
reduced to the ordinary rules. It is only necessary to consider in place
of f(a) and f(b) the limits

$$A = \lim_{x \to -\infty} f(x), \qquad B = \lim_{x \to +\infty} f(x).$$
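For the function $y = x/(1 + x^2)$ mentioned above, the comparison of stationary values with the limits A and B can be sketched as follows. The stationary points and the two limits are entered by hand (they follow from $f'(x) = (1 - x^2)/(1 + x^2)^2$ and from the fact that f(x) tends to zero in both directions); the variable names are illustrative.

```python
def f(x):
    return x / (1 + x * x)

stationary = [-1.0, 1.0]   # roots of f'(x) = (1 - x**2) / (1 + x**2)**2
limit_A = 0.0              # limit of f(x) as x -> -infinity
limit_B = 0.0              # limit of f(x) as x -> +infinity

values = {x: f(x) for x in stationary}

# An extreme value is attained only if some stationary value beats both limits.
maximum_attained = max(values.values()) > max(limit_A, limit_B)
minimum_attained = min(values.values()) < min(limit_A, limit_B)

print(values, maximum_attained, minimum_attained)
```

For a function such as arc tan x, which has no stationary points, the same comparison leaves only the two limits, and neither of them is attained.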


Derivatives of higher orders. We have just seen how, for closer study
of the graph of a function, we must examine the changes in its derivative
f'(x). This derivative is a function of x, so that we may in turn find its
derivative.

The derivative of the derivative is called the second derivative and is
denoted by

$$[y']' = y'' \quad \text{or} \quad [f'(x)]' = f''(x).$$

Analogously, we may calculate the third derivative

$$[y'']' = y''' \quad \text{or} \quad [f''(x)]' = f'''(x)$$

and more generally the nth derivative or, as it is also called, the derivative
of nth order

$$y^{(n)} = f^{(n)}(x).$$

Of course, it must be kept in mind that, for a certain value of x (or even
for all values of x) this sequence may break off at the derivative of some
order, say the kth; it may happen that $f^{(k)}(x)$ exists but not $f^{(k+1)}(x)$.
Derivatives of arbitrary order will appear later in §9 in connection with
the Taylor formula. For the moment we confine ourselves to the second
derivative.

Significance of the second derivative; convexity and concavity. The
second derivative has a simple significance in mechanics. Let s = f(t)
be a law of motion along a straight line; then s' is the velocity and s''
is the "velocity of the change in the velocity" or more simply the "acceleration"
of the point at time t. For example, for a falling body under the
force of gravity

$$s = \frac{g t^2}{2} + v_0 t + s_0,$$

$$s' = g t + v_0,$$

$$s'' = g;$$

that is, the acceleration of falling bodies is constant.

The second derivative also has a simple geometric meaning. Just as the
sign of the first derivative determines whether the function is increasing
or decreasing, so the sign of the second derivative determines the side
toward which the graph of the function will be curved.

Suppose, for example, that on a given interval the second derivative
is everywhere positive. Then the first derivative increases and therefore
$f'(x) = \tan \alpha$ increases and the angle $\alpha$ of inclination of the tangent line
itself increases (figure 17). Thus as we move along the curve it keeps
turning constantly to the same side, namely upward, and is thus, as they
say, "convex downward."

On the other hand, in a part of a curve where the second derivative is
constantly negative (figure 18) the graph of the function is "convex
upward."*

* Strictly defined, the "convexity upward" is that property of the curve that consists
of its lying above (more precisely "not below") the chord joining any two of its points;
analogously, for "convexity downward" (which is also simply called "concavity"),
the curve does not lie above its chords.

Criteria for maxima and minima; study of the graphs of curves. If
throughout the whole interval over which x varies the curve is convex
upward and if at a certain point $x_0$ of this interval the derivative is equal
to zero, then at this point the function necessarily attains its maximum;
and its minimum in the case of convexity downward. This simple consideration
often allows us, after finding a point at which the derivative
is equal to zero, to decide thereupon whether at this point the function
has a local maximum or minimum.*


Example 1. Let us study the appearance of the graph of the function

$$f(x) = \frac{x^3}{3} - \frac{5x^2}{2} + 6x - 2.$$

We take its first derivative and set it equal to zero,

$$f'(x) = x^2 - 5x + 6 = 0.$$

The roots of the equation obtained in this way are $x_1 = 2$, $x_2 = 3$. The
corresponding values of the function are

$$f(2) = 2\tfrac{2}{3}, \quad f(3) = 2\tfrac{1}{2}.$$

We then mark these two points on the diagram. Along with these we
may also mark the point with coordinates x = 0 and y = f(0) = -2
where the graph intersects the y-axis. The second derivative is
$f''(x) = 2x - 5$. This reduces to zero for $x = \tfrac{5}{2}$, so that

$$f''(x) > 0 \quad \text{for } x > \tfrac{5}{2},$$

$$f''(x) < 0 \quad \text{for } x < \tfrac{5}{2}.$$

The point

$$\left(\tfrac{5}{2},\ f\left(\tfrac{5}{2}\right)\right) = \left(2\tfrac{1}{2},\ 2\tfrac{7}{12}\right)$$

is a point of inflection of the graph. To the left of this point the curve is
convex upward, and to the right it is convex downward.

It is now evident that the point x = 2 is a maximum point and
the point x = 3 is a minimum point for the function.
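The arithmetic of Example 1 can be reproduced exactly with rational numbers. The sketch below is illustrative; the helper `classify`, which applies the sign of the second derivative at a stationary point, is our own naming.

```python
from fractions import Fraction as F

def f(x):
    return x**3 / 3 - F(5, 2) * x**2 + 6 * x - 2

def f1(x):
    """First derivative of f."""
    return x**2 - 5 * x + 6

def f2(x):
    """Second derivative of f."""
    return 2 * x - 5

def classify(x):
    """Decide the character of a stationary point from the sign of f''."""
    if f1(x) != 0:
        return "not stationary"
    if f2(x) < 0:
        return "maximum"
    if f2(x) > 0:
        return "minimum"
    return "undecided"

for point in (F(2), F(3)):
    print(point, f(point), classify(point))
print("inflection at", F(5, 2), "with value", f(F(5, 2)))
```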


* In more complicated cases, where the second derivative itself changes sign, the
problem of determining the character of the stationary point is solved by means of the
Taylor formula (§9).

On the basis of these results we conclude that the graph of the function
y = f(x) has the appearance sketched in figure 19. To the right of the
point (0, -2) the curve rises with increasing x, is convex upward, and
attains its maximum at the point $(2, 2\tfrac{2}{3})$, after which it begins to fall.
At the point $(2\tfrac{1}{2}, 2\tfrac{7}{12})$, where $f''(x) = 0$, the convexity changes to
concavity. Then at the point $(3, 2\tfrac{1}{2})$ the function attains its minimum
and from there on rises to infinity. The final statement comes from the
fact that the first term of the function, the one containing the highest
(third) power of x, approaches infinity faster than the second and third
terms. For the same reason the graph of the function approaches $-\infty$
as x assumes numerically larger negative values.

[Fig. 19.]

Example 2. We shall prove the inequality $e^x \ge 1 + x$ for arbitrary
x. For this purpose we consider the function $f(x) = e^x - x - 1$. Its
first derivative is $f'(x) = e^x - 1$, which reduces to zero only for x = 0.
The second derivative $f''(x) = e^x > 0$ for all x. Consequently the graph
of the function f(x) is convex downward. The number f(0) = 0 is a
minimum for the function and $e^x - x - 1 \ge 0$ for all x.

The study of graphs has many different purposes. They often show
very clearly, for example, the number of real roots of a given equation.
Thus, in order to demonstrate that the equation

$$x e^x = 2$$

has a single real root, we may study the graphs of the functions $y = e^x$
and y = 2/x (as sketched in figure 20). It is easy to see that these graphs
intersect at only one point, so that the equation $e^x = 2/x$ has exactly one
root.

The methods of analysis are extensively applied to questions of approximate
calculation of the roots of an equation. On this subject, see
Chapter IV, §5.
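As a small foretaste of such approximate calculations, the single root of $x e^x = 2$ can be located by repeated halving of an interval. Bisection is our own choice of method here, not one taken from the text, and the bracket [0, 1] is an assumption justified by the signs of $x e^x - 2$ at its ends.

```python
import math

def g(x):
    return x * math.exp(x) - 2

def bisect(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # the sign change lies in the right half
        else:
            hi = mid                # the sign change lies in the left half
    return (lo + hi) / 2

# g(0) = -2 < 0 and g(1) = e - 2 > 0, so the root lies between 0 and 1.
root = bisect(g, 0.0, 1.0)
print(root, g(root))
```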


§8. Increment and Differential of a Function

The differential of a function. Let us consider a function y = f(x)
that has a derivative. The increment of this function

$$\Delta y = f(x + \Delta x) - f(x),$$

corresponding to the increment $\Delta x$, has the property that the ratio
$\Delta y / \Delta x$, as $\Delta x \to 0$, approaches a finite limit, equal to the derivative

$$\frac{\Delta y}{\Delta x} \to f'(x).$$

This fact may be written as an equality

$$\frac{\Delta y}{\Delta x} = f'(x) + \alpha,$$

where the value of $\alpha$ depends on $\Delta x$ in such a way that as $\Delta x \to 0$, $\alpha$ also
approaches zero. Thus the increment of a function may be represented
in the form

$$\Delta y = f'(x)\,\Delta x + \alpha\,\Delta x,$$

where $\alpha \to 0$, if $\Delta x \to 0$.

The first summand on the right side of this equality depends on $\Delta x$
in a very simple way, namely it is proportional to $\Delta x$. It is called the
differential of the function, at the point x, corresponding to the given
increment $\Delta x$, and is denoted by

$$dy = f'(x)\,\Delta x.$$

The second summand has the characteristic property that, as $\Delta x \to 0$,
it approaches zero more rapidly than $\Delta x$, as a result of the presence of the
factor $\alpha$. It is therefore said to be an infinitesimal of higher order than
$\Delta x$ and, in case $f'(x) \neq 0$, it is also of higher order than the first summand.

By this we mean that for sufficiently small $\Delta x$ the second summand is
small in itself and its ratio to $\Delta x$ is also arbitrarily small.

The decomposition of $\Delta y$ into two summands, of which the first (the
principal part) depends linearly on $\Delta x$ and the second is negligible for
small $\Delta x$, may be illustrated by figure 21. The segment $BC = \Delta y$,
where $BC = BD + DC$, $BD = \tan \beta \cdot \Delta x = f'(x)\,\Delta x = dy$, and DC
is an infinitesimal of higher order than $\Delta x$.

In practical problems the differential is often used as an approximate
value for the increment in the function. For example, suppose we have
the problem of determining the volume of the walls of a closed cubical
box whose interior dimensions are 10 x 10 x 10 cm and the thickness
of whose walls is 0.05 cm. If great accuracy is not required, we may argue
as follows. The volume of all the walls of the box represents the increment
$\Delta y$ of the function $y = x^3$ for x = 10 and $\Delta x = 0.1$ (twice the wall
thickness, since each edge is lengthened by two walls). So we find approximately

$$\Delta y \approx dy = (x^3)'\,\Delta x = 3x^2\,\Delta x = 3 \cdot 10^2 \cdot 0.1 = 30 \text{ cm}^3.$$
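How good the approximation is in this example can be seen by computing the exact increment alongside the differential. A minimal sketch, with variable names of our own choosing:

```python
x, dx = 10.0, 0.1

exact_increment = (x + dx) ** 3 - x**3    # true volume of the walls, in cm^3
differential = 3 * x**2 * dx              # dy = (x^3)' * dx
error = exact_increment - differential    # equals 3*x*dx**2 + dx**3

print(exact_increment, differential, error)
```

The error, about 0.3 cm^3, is the part of the increment that is of higher order in $\Delta x$; it is roughly one percent of the value given by the differential.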



For symmetry in the notation it is customary to denote the increment
of the independent variable by dx and to call it also a differential. With
this notation the differential of the function may be written thus:

$$dy = f'(x)\,dx.$$

Then the derivative is the ratio $f'(x) = dy/dx$ of the differential of the
function to the differential of the independent variable.

The differential of a function originated historically in the concept
of an "indivisible." This concept, which from a modern point of view
was never very clearly defined, was in its time, in the 18th century, a
fundamental one in mathematical analysis. The ideas concerning it have
undergone essential changes in the course of several centuries. The
indivisible, and later the differential of a function, were represented as
actual infinitesimals, as something in the nature of an extremely small
constant magnitude, which however was not zero. The definition given
in this section is the one accepted in present-day analysis. According to this
definition the differential is a finite magnitude for each increment $\Delta x$
and is at the same time proportional to $\Delta x$. The other fundamental
property of the differential, the character of its difference from $\Delta y$, may
be recognized only in motion, so to speak: if we consider an increment
$\Delta x$ which is approaching zero (which is infinitesimal), then the difference
between dy and $\Delta y$ will be arbitrarily small even in comparison with
$\Delta x$.

This substitution of the differential in place of small increments of the
function forms the basis of most of the applications of infinitesimal
analysis to the study of nature. The reader will see this in a particularly
clear way in the case of differential equations, dealt with in this book in
Chapters V and VI.

Thus, in order to determine the function that represents a given physical
process, we try first of all to set up an equation that connects this function
in some definite way with its derivatives of various orders. The method
of obtaining such an equation, which is called a differential equation,
often amounts to replacing increments of the desired functions by their
corresponding differentials.

As an example let us solve the following problem. In a rectangular
system of coordinates Oxyz, we consider the surface obtained by rotation
about the Oz axis of the parabola whose equation (in the Oyz plane) is
$z = y^2$. This surface is called a paraboloid of revolution (figure 22).
Let v denote the volume of the body bounded by the paraboloid and the
plane parallel to the Oxy plane at a distance z from it.

It is evident that v is a function of z (z > 0).

To determine the function v, we attempt to find its differential dv.

Fig. 22.

The increment $\Delta v$ of the function v at the point z is equal to the volume
bounded by the paraboloid and by two planes parallel to the Oxy plane
at distances z and $z + \Delta z$ from it.

It is easy to see that the magnitude of $\Delta v$ is greater than the volume
of the circular cylinder of radius $\sqrt{z}$ and height $\Delta z$ but less than that
of the circular cylinder with radius $\sqrt{z + \Delta z}$ and height $\Delta z$.
Thus

$$\pi z\,\Delta z < \Delta v < \pi (z + \Delta z)\,\Delta z$$

and so

$$\Delta v = \pi (z + \theta\,\Delta z)\,\Delta z = \pi z\,\Delta z + \pi\theta\,(\Delta z)^2,$$

where $\theta$ is some number depending on $\Delta z$ and satisfying the inequality

$$0 < \theta < 1.$$

So we have succeeded in representing the increment $\Delta v$ in the form
of a sum, the first summand of which is proportional to $\Delta z$, while the
second is an infinitesimal of higher order than $\Delta z$ (as $\Delta z \to 0$). It follows
that the first summand is the differential of the function v

$$dv = \pi z\,\Delta z,$$

or

$$dv = \pi z\,dz,$$

since $\Delta z = dz$ for the independent variable z.

The equation so obtained relates the differentials dv and dz (of the
variables v and z) to each other and thus is called a differential equation.
If we take into account that

$$\frac{dv}{dz} = v',$$

where v' is the derivative of v with respect to the variable z, our differential
equation may also be written in the form

$$v' = \pi z.$$

To solve this very simple differential equation we must find a function
of z whose derivative is equal to $\pi z$. Problems of this sort are treated in a
general way in §§10 and 11, but for the moment we urge the reader to
verify that a solution of our equation is given by $v = \pi z^2/2 + C$, where
for C we may choose an arbitrary number.* In our case the volume of
the body is obviously zero for z = 0 (see figure 22), so that C = 0. Thus
our function is given by $v = \pi z^2/2$.
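The squeeze between the two cylinders can be checked numerically. The following Python sketch is an illustration only and not part of the original argument; the names `slab_volume_bounds` and `volume_formula` are assumptions introduced here. It stacks n thin cylinders of height z/n inside and outside the paraboloid and compares both sums with $\pi z^2/2$.

```python
import math

def slab_volume_bounds(z, n):
    """Inner and outer cylinder sums for the paraboloid z = y^2 up to height z.

    At height h the cross-section is a circle of radius sqrt(h), i.e. area pi*h.
    """
    dz = z / n
    lower = sum(math.pi * (k * dz) * dz for k in range(n))        # radius^2 = k*dz
    upper = sum(math.pi * ((k + 1) * dz) * dz for k in range(n))  # radius^2 = (k+1)*dz
    return lower, upper

def volume_formula(z):
    """Solution of v' = pi*z with v(0) = 0."""
    return math.pi * z ** 2 / 2

if __name__ == "__main__":
    for n in (10, 100, 1000):
        lo, hi = slab_volume_bounds(2.0, n)
        print(n, lo, volume_formula(2.0), hi)
```

As n grows, the two sums close in on the value given by the formula from both sides, which is the numerical counterpart of discarding the higher-order term $\pi\theta\,(\Delta z)^2$.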

The mean value theorem and examples of its application. The differential
expresses the approximate value of the increment of the function
in terms of the increment of the independent variable and of the derivative
at the initial point. So for the increment from x = a to x = b, we have

$$f(b) - f(a) \approx f'(a)(b - a).$$


* This formula gives all the solutions.

It is possible to obtain an exact equation of this sort if we replace the
derivative $f'(a)$ at the initial point by the derivative at some intermediate
point, suitably chosen in the interval (a, b). More precisely: If y = f(x)
is a function which is differentiable on the interval $a \le x \le b$, then there
exists a point $\xi$ strictly within this interval, such that the following exact
equality holds

$$f(b) - f(a) = f'(\xi)(b - a). \tag{22}$$

The geometric interpretation of this “mean-value theorem” (also called
Lagrange’s formula or the finite-difference formula) is extraordinarily
simple. Let A, B be the points on the graph of the function f(x) which
correspond to x = a and x = b, and let us join A and B by the chord
AB (figure 23). Now let us move the straight line AB, keeping it constantly
parallel to itself, up or down. At the moment when this straight line cuts
the graph for the last time, it will be tangent to the graph at a certain
point C. At this point (let the corresponding abscissa be $x = \xi$), the
tangent line will form the same angle of inclination $\alpha$ as the chord AB.
But for the chord we have

$$\tan\alpha = \frac{f(b) - f(a)}{b - a}.$$

On the other hand at the point C

$$\tan\alpha = f'(\xi).$$

This equation

$$\frac{f(b) - f(a)}{b - a} = f'(\xi)$$

is exactly the mean-value theorem.*

Formula (22) has the peculiar feature that the point $\xi$ appearing in it
is unknown to us; we know only that it lies “somewhere in the interval
(a, b).” But in spite of this indeterminacy, the formula has great theoretical
significance and is part of the proof of many theorems in analysis. The
immediate practical importance of this formula is also very great, since
it enables us to estimate the increase in a function when we know the
limits between which its derivative can vary. For example,

$$|\sin b - \sin a| = |\cos\xi|\,(b - a) \le b - a.$$

Here a, b and $\xi$ are angles, expressed in radian measure; $\xi$ is some value
between a and b; $\xi$ itself is unknown, but we know that $|\cos\xi| \le 1$.
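Although formula (22) does not tell us where the intermediate point lies, for a concrete function it can be located numerically. The Python sketch below is an illustration under stated assumptions: the helper name `mean_value_point` is hypothetical, and plain bisection is used, which presupposes that the derivative is continuous and that the difference between the derivative and the chord slope changes sign on [a, b].

```python
import math

def mean_value_point(df, slope, a, b, tol=1e-12):
    """Find xi in [a, b] with df(xi) == slope by bisection.

    Assumes df is continuous and df(x) - slope changes sign on [a, b].
    """
    def g(x):
        return df(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # the sign change lies in the left half
        else:
            lo = mid   # the sign change lies in the right half
    return (lo + hi) / 2

if __name__ == "__main__":
    a, b = 0.0, math.pi / 2
    slope = (math.sin(b) - math.sin(a)) / (b - a)   # slope of the chord AB
    xi = mean_value_point(math.cos, slope, a, b)
    print(xi, math.cos(xi), slope)
```

For f(x) = sin x on [0, π/2] the chord has slope 2/π, so the point found satisfies cos ξ = 2/π.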

From formula (22) it is clear that a function whose derivative is everywhere
equal to zero must be a constant; at no part of the interval can it
receive an increment different from zero. Analogously, the reader will
easily prove that a function whose derivative is everywhere positive must
everywhere increase, and if its derivative is negative, the function must
decrease. We give here without proof one of the many generalizations
of the mean-value theorem.

For arbitrary functions $\phi(x)$ and $\psi(x)$ differentiable in the interval [a, b],
provided only that $\psi'(x) \ne 0$ in (a, b), the following equation† holds

$$\frac{\phi(b) - \phi(a)}{\psi(b) - \psi(a)} = \frac{\phi'(\xi)}{\psi'(\xi)}, \tag{23}$$

where $\xi$ is some point in the interval (a, b).‡

* Of course these arguments only give a geometric interpretation of the theorem
and by no means form a rigorous proof.

† Formula (23) can be derived by a simple application of the mean-value theorem
to the function

$$f(x) = \phi(x) - \frac{\phi(b) - \phi(a)}{\psi(b) - \psi(a)}\,\psi(x).$$

‡ By the symbols [a, b] and (a, b) we denote the sets of values of x satisfying the
inequalities $a \le x \le b$ and $a < x < b$ respectively.

From this theorem we can derive a general method for calculating the
limits of an expression like

$$\lim_{x \to 0} \frac{\phi(x)}{\psi(x)} \tag{24}$$

if $\phi(0) = \psi(0) = 0$. From formula (23) we have

$$\frac{\phi(x)}{\psi(x)} = \frac{\phi(x) - \phi(0)}{\psi(x) - \psi(0)} = \frac{\phi'(\xi)}{\psi'(\xi)},$$

where $\xi$ is between 0 and x, and therefore $\xi \to 0$ together with x. This
allows us to calculate the limit

$$\lim_{x \to 0} \frac{\phi'(x)}{\psi'(x)}$$

instead of the limit (24), which is in many cases very much easier.*


Example. Let us find the $\lim_{x \to 0} \frac{x - \sin x}{x^3}$. By making use of the rule
three times, we have successively

$$\lim_{x \to 0} \frac{x - \sin x}{x^3} = \lim_{x \to 0} \frac{1 - \cos x}{3x^2} = \lim_{x \to 0} \frac{\sin x}{6x} = \lim_{x \to 0} \frac{\cos x}{6} = \frac{1}{6}.$$
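The value 1/6 can be made plausible by direct evaluation. The short Python sketch below is only a numerical illustration, with the hypothetical helper names `ratio`, `ratio_first` and `ratio_second`; it evaluates the original quotient and the quotients obtained after one and two differentiations at a few small values of x.

```python
import math

def ratio(x):
    """Original quotient (x - sin x) / x^3."""
    return (x - math.sin(x)) / x ** 3

def ratio_first(x):
    """Quotient of the first derivatives: (1 - cos x) / (3 x^2)."""
    return (1 - math.cos(x)) / (3 * x ** 2)

def ratio_second(x):
    """Quotient of the second derivatives: sin x / (6 x)."""
    return math.sin(x) / (6 * x)

if __name__ == "__main__":
    for x in (0.5, 0.1, 0.01):
        print(x, ratio(x), ratio_first(x), ratio_second(x))
```

All three columns approach 1/6 as x decreases. Much smaller values of x are best avoided in such an experiment, since the subtraction x − sin x then loses accuracy in floating-point arithmetic.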


§9. Taylor’s Formula 

The function

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n,$$

where the coefficients $a_k$ are constants, is called a polynomial of degree
n. In particular, y = ax + b is a polynomial of the first degree and
$y = ax^2 + bx + c$ is a polynomial of the second degree. Polynomials
may be considered as the simplest of all functions. In order to calculate
their value for a given x, we require only the operations of addition,
subtraction, and multiplication; not even division is needed. Polynomials
are continuous for all x and have derivatives of arbitrary order. Also,
the derivative of a polynomial is again a polynomial, of degree lower by
one, and the derivatives of order n + 1 and higher of a polynomial of
degree n are equal to zero.
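Both remarks, that a polynomial is evaluated with additions and multiplications alone and that differentiation lowers the degree by one, are easy to see in code. The Python sketch below uses Horner's scheme, a standard evaluation method that the text does not name; the function names `horner` and `derivative` are assumptions made for this illustration.

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n, with coeffs = [a_0, ..., a_n].

    Only additions and multiplications are used.
    """
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def derivative(coeffs):
    """Coefficient list of the derivative polynomial (one entry shorter)."""
    return [k * a for k, a in enumerate(coeffs)][1:]

if __name__ == "__main__":
    p = [1, 2, 3]                      # 1 + 2x + 3x^2
    print(horner(p, 2))                # value of p at x = 2
    d = p
    while d:
        d = derivative(d)
        print(d)                       # the list empties after n + 1 steps
```

An empty coefficient list stands for the zero polynomial, so for a polynomial of degree n the list becomes empty after n + 1 differentiations.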

If to the polynomials we adjoin functions of the form

$$y = \frac{a_0 + a_1 x + \cdots + a_n x^n}{b_0 + b_1 x + \cdots + b_m x^m},$$

for the calculation of which we also need division, and also the functions
$\sqrt{x}$ and $\sqrt[3]{x}$ and, finally, arithmetical combinations of these functions,
we obtain essentially all the functions whose values can be calculated
by methods learned in the secondary school.

* The same rule is valid for finding the limit of a fractional expression in which
the numerator and the denominator both approach infinity. This method, which is
very convenient for finding such limits (or, as we say, for the removal of indeterminacies),
will be used, for example, in §3 of Chapter XII.

While we were still in school, we formed some notion of a number of 
other functions, like 


Vx, log x, sin x, arc tan x, •••. 

But though we became acquainted with the most important properties
of these functions, we found no answer in elementary mathematics to the
question: How can we calculate them? What sort of operations, for
example, is it necessary to perform on x in order to obtain log x or sin x?
The answer to this question is given by methods that have been worked
out in analysis. Let us examine one of these methods.

Taylor’s formula. On an interval containing the point a, let there be
given a function f(x) with derivatives of every order. The polynomial
of first degree

$$p_1(x) = f(a) + f'(a)(x - a)$$

has the same value as f(x) at the point x = a and also, as is easily verified,
has the same derivative as f(x) at this point. Its graph is a straight line,
which is tangent to the graph of f(x) at the point a. It is possible to choose
a polynomial of the second degree, namely

$$p_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2,$$

which at the point x = a has with f(x) a common value and a common
first and second derivative. Its graph at the point a will follow that
of f(x) even more closely. It is natural to expect that if we construct a
polynomial which at x = a has the same first n derivatives as f(x) at the
same point, then this polynomial will be a still better approximation to
f(x) at points x near a. Thus we obtain the following approximate equality,
which is Taylor’s formula

$$f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n. \tag{25}$$

The right side of this formula is a polynomial of degree n in $(x - a)$.
For each x the value of this polynomial can be calculated if we know the
values of $f(a), f'(a), \ldots, f^{(n)}(a)$.

For functions which have an (n + 1)th derivative, the right side of this
formula, as is easy to show, differs from the left side by a small quantity
which approaches zero more rapidly than $(x - a)^n$. Moreover, it is the
only possible polynomial of degree n that differs from f(x), for x close
to a, by a quantity that approaches zero, as $x \to a$, more rapidly than
$(x - a)^n$. If f(x) itself is an algebraic polynomial of degree n, then the
approximate equality (25) becomes an exact one.

Finally, and this is particularly important, we can give a simple expression
for the difference between the right side of formula (25) and the
actual value of f(x). To make the approximate equality (25) exact, we
must add to the right side a further term, called the “remainder term”

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - a)^{n+1}. \tag{26}$$

This final supplementary term*

$$R_{n+1}(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - a)^{n+1}$$

has the peculiarity that the derivative appearing in it is to be calculated
in each case not at the point a but at a suitably chosen point $\xi$, which
is unknown but lies somewhere in the interval between a and x.

The proof of equality (26) is rather cumbersome but quite simple in
essence. We shall give here a somewhat artificial version of the proof,
which has the merit of being concise.

In order to find out by how much the left side in the approximate
formula (25) differs from the right, let us consider the ratio of the difference
between the two sides in equality (25) to the quantity $-(x - a)^{n+1}$

$$\frac{f(x) - \left[f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n\right]}{-(x - a)^{n+1}}. \tag{27}$$


We also introduce the function

$$\phi(u) = f(u) + f'(u)(x - u) + \cdots + \frac{f^{(n)}(u)}{n!}(x - u)^n$$

of a variable u, taking x to be fixed (constant). Then the numerator in
(27) will represent the increase of this function as we pass from u = a
to u = x, and the denominator will be the increase over the same interval
of the function

$$\psi(u) = (x - u)^{n+1}.$$

We now make use of the generalized mean-value theorem quoted earlier

$$\frac{\phi(x) - \phi(a)}{\psi(x) - \psi(a)} = \frac{\phi'(\xi)}{\psi'(\xi)}.$$

Differentiating the functions $\phi(u)$ and $\psi(u)$ with respect to u (it must be
recalled that the value of x has been fixed) we find that all the terms of
$\phi'(u)$ cancel in pairs except $\frac{f^{(n+1)}(u)}{n!}(x - u)^n$, while
$\psi'(u) = -(n+1)(x - u)^n$, so that

$$\frac{\phi'(\xi)}{\psi'(\xi)} = -\frac{f^{(n+1)}(\xi)}{(n+1)!}.$$

The equality of this last expression with the original quantity (27) gives
Taylor’s formula in the form (26).

* This is only one of the possible forms for the remainder term.

In the form (26) Taylor’s formula not only provides a means of approximate
calculation of f(x) but also allows us to estimate the error. Let us
consider the simple example

$$y = \sin x.$$

The values of the function sin x and of its derivatives of arbitrary order
are known for x = 0. Let us make use of these values to write Taylor’s
formula for sin x, choosing a = 0 and limiting ourselves to the case
n = 4. We find successively

$$f(x) = \sin x, \quad f'(x) = \cos x, \quad f''(x) = -\sin x,$$
$$f'''(x) = -\cos x, \quad f^{IV}(x) = \sin x, \quad f^{V}(x) = \cos x;$$
$$f(0) = 0, \quad f'(0) = 1, \quad f''(0) = 0,$$
$$f'''(0) = -1, \quad f^{IV}(0) = 0, \quad f^{V}(\xi) = \cos\xi.$$

Therefore

$$\sin x = x - \frac{x^3}{3!} + R_5, \quad \text{where} \quad R_5 = \frac{x^5}{5!}\cos\xi.$$

Although the exact value $R_5$ is unknown, still we can easily estimate it
from the fact that $|\cos\xi| \le 1$. For all values of x between 0 and $\pi/4$
we have

$$|R_5| \le \frac{1}{120}\left(\frac{\pi}{4}\right)^5 < \frac{1}{400}.$$

Consequently, on the interval $[0, \pi/4]$ the function sin x may be considered,
with accuracy up to 1/400, as equal to the polynomial of third degree

$$\sin x = x - \frac{1}{6}x^3.$$

If we were to take more terms in Taylor’s expansion for sin x, we would
obtain a polynomial of higher degree which would approximate sin x
still more closely.
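The error estimate for the cubic approximation can be confirmed by a direct scan of the interval. The Python sketch below is an illustration only; the names `sin_cubic` and `worst_error` and the choice of 1000 sample points are assumptions made here.

```python
import math

def sin_cubic(x):
    """Third-degree Taylor polynomial of sin x at a = 0."""
    return x - x ** 3 / 6

def worst_error(samples=1000):
    """Largest |sin x - sin_cubic(x)| over equally spaced points of [0, pi/4]."""
    right = math.pi / 4
    return max(abs(math.sin(k * right / samples) - sin_cubic(k * right / samples))
               for k in range(samples + 1))

# Bound from the remainder term: |R_5| <= (pi/4)^5 / 120
remainder_bound = (math.pi / 4) ** 5 / 120

if __name__ == "__main__":
    print(worst_error(), remainder_bound, 1 / 400)
```

The observed error stays below the bound supplied by the remainder term, which in turn is slightly smaller than 1/400.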

The tables for trigonometric and other functions are calculated by
similar methods.

The laws of nature, as a rule, can be expressed with good approximation
by functions that may be differentiated as often as we like and that in
their turn may be approximated by polynomials, the degree of the polynomial
being determined by the accuracy desired.

Taylor’s series. If in formula (25) we take a larger and larger number
of terms, then the difference between the right side and f(x), expressed
by the remainder term $R_{n+1}(x)$, may tend to zero. Of course this will not
always occur: neither for all functions nor for all values of x. But there
exists a broad class of functions (the so-called analytic functions) for
which the remainder term $R_{n+1}(x)$ does in fact approach zero as $n \to \infty$,
at least for all values of x within a certain interval around the point a.
For these functions the Taylor formula allows us to calculate f(x) with
any desired degree of accuracy. Let us examine such functions more
closely.

If $R_{n+1}(x) \to 0$ as $n \to \infty$, then from (26) it follows that

$$f(x) = \lim_{n \to \infty} \left[f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n\right].$$

In this case we say that f(x) has been expanded in a convergent infinite
series

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots,$$

in increasing powers of $(x - a)$. This series is called a Taylor series, and
f(x) is said to be the sum of the series. Let us consider some examples
(with a = 0):

1. $(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots$
(valid for $|x| < 1$ and for arbitrary real n).

2. $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$ (valid for all x).

3. $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$ (valid for all x).

4. $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$ (valid for all x).

5. $\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots$ (valid for $|x| \le 1$).

The first of these examples is the famous binomial theorem of Newton,
which was obtained by Newton for all n but completely proved in his
time only for integral n. This example served as a model for the establishment
of the general Taylor formula. The last two formulas allow us,
for x = 1, to calculate with arbitrarily good approximation the numbers
e and $\pi$.
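The calculation of e and π from the last two series can be tried directly. The Python sketch below is a minimal illustration, with the hypothetical function names `e_partial` and `pi_partial`; it sums the series for $e^x$ at x = 1, and the series for arctan x at x = 1, where arctan 1 = π/4.

```python
import math

def e_partial(n):
    """Partial sum 1 + 1/1! + 1/2! + ... + 1/n! of the series for e^x at x = 1."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k          # term is now 1/k!
        total += term
    return total

def pi_partial(n):
    """4 * (1 - 1/3 + 1/5 - ...) with n terms of the arctan series at x = 1."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(n))

if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, e_partial(n), math.e)
    for n in (10, 100, 1000):
        print(n, pi_partial(n), math.pi)
```

The two series behave very differently: a dozen or so terms already give e to many decimal places, whereas at x = 1, the edge of its interval of validity, the arctangent series converges slowly and a thousand terms give π only to about three decimal places.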

The Taylor formula, which opens up the way for most of the calculations
in applied analysis, is extremely important from the practical point of
view.

Many of the laws of nature, physical and chemical processes, the motion
of bodies, and the like, are expressed with great accuracy by functions
which may be expanded in a Taylor series. The theory of such functions
can be formulated in a clearer and more complete way if we consider them
as functions of a complex variable (see Chapter IX).

The idea of approximating a function by polynomials or of representing
it as the sum of an infinite number of simpler functions underwent far-reaching
developments in analysis, where it now forms an independent
branch, the theory of approximation of functions (see Chapter XII).


§10. Integral

From Chapter I and from §1 of the present chapter the reader already
knows that the concept of the integral, and more generally of the integral
calculus, had its historical origin in the need for solving concrete problems,
a characteristic example of which is the calculation of the area of a curvilinear
figure. The present section is devoted to these questions. In it
we will also discuss the aforementioned connection between the problems
of the differential and the integral calculus, which was not fully cleared
up until the 18th century.

Area. Let us suppose that a curve above the x-axis forms the graph
of the function y = f(x). We attempt to find the area S of the segment
bounded by the line y = f(x), by the x-axis and by the straight lines drawn
through the points x = a and x = b parallel to the y-axis.

To solve this problem we proceed as follows. We divide the
interval [a, b] into n parts, not necessarily equal. We denote the length
of the first part by $\Delta x_1$, of the second by $\Delta x_2$, and so forth up to the
final part $\Delta x_n$. In each segment we choose points $\xi_1, \xi_2, \ldots, \xi_n$
and set up the sum

$$S_n = f(\xi_1)\Delta x_1 + f(\xi_2)\Delta x_2 + \cdots + f(\xi_n)\Delta x_n. \tag{28}$$

The magnitude $S_n$ is obviously equal to the sum of the areas of
the rectangles shaded in figure 24.

Fig. 24.

The finer we make the subdivision of the segment [a, b], the closer $S_n$
will be to the area S. If we carry out a sequence of such constructions,
dividing the interval [a, b] into successively smaller and smaller parts,
then the sums $S_n$ will approach S.

The possibility of dividing [a, b] into unequal parts makes it necessary
for us to define what we mean by “successively smaller” subdivisions.
We assume not only that n increases beyond all bounds but also that the
length of the greatest $\Delta x_i$ in the nth subdivision approaches zero. Thus

$$S = \lim_{\max \Delta x_i \to 0} \left[f(\xi_1)\Delta x_1 + f(\xi_2)\Delta x_2 + \cdots + f(\xi_n)\Delta x_n\right] = \lim_{\max \Delta x_i \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta x_i. \tag{29}$$


The calculation of the desired area has in this way been reduced to
finding the limit (29).
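The role of the condition that the greatest part shrinks to zero, rather than merely that n grows, can be seen in a small experiment. The Python sketch below is an illustration under assumptions made here: the function name `riemann_sum` is hypothetical, and the unequal subdivision and the points inside each part are drawn at random from a seeded generator.

```python
import random

def riemann_sum(f, a, b, n, rng):
    """Sum of f(xi_i) * dx_i over a random, unequal subdivision of [a, b].

    Returns the sum together with the length of the widest part.
    """
    cuts = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + cuts + [b]
    total, widest = 0.0, 0.0
    for left, right in zip(points, points[1:]):
        xi = rng.uniform(left, right)        # arbitrary point of this part
        total += f(xi) * (right - left)
        widest = max(widest, right - left)
    return total, widest

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000, 10000):
        s, w = riemann_sum(lambda x: x * x, 0.0, 1.0, n, rng)
        print(n, w, s)
```

For f(x) = x² on [0, 1] the sums settle near 1/3 as the widest part shrinks, whatever the random choice of subdivision and of intermediate points.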

We note that when we first set up the problem, we had only an
empirical idea of what we mean by the area of our curvilinear figure,
but we had no precise definition. But now we have obtained an
exact definition of the concept of area: It is the limit (29).
We now have not only an intuitive notion of area but also a
mathematical definition, on the basis of which we can calculate the area
numerically (compare the remarks at the end of §3, concerning velocity
and the length of a circumference).

We have assumed that $f(x) \ge 0$. If f(x) changes sign, as in
figure 25, then the limit (29) will give us the algebraic sum of the areas
of the segments lying between the curve y = f(x) and the x-axis, where
the segments above the x-axis are taken with a plus sign and those below
with a minus sign.

Definite integral. The need to calculate the limit (29) arises in many
other problems. For example, suppose that a point is moving along a
straight line with variable velocity v = f(t). How are we to determine
the distance s covered by the point in the time from t = a to t = b?

Let us assume that the function f(t) is continuous; that is, in small
intervals of time the velocity changes only slightly. We divide the interval
[a, b] into n parts, of length $\Delta t_1, \Delta t_2, \ldots, \Delta t_n$. To calculate an approximate
value for the distance covered in each interval $\Delta t_i$, we will suppose
that the velocity in this period of time is constant, equal throughout to
its actual value at some intermediate point $\xi_i$. The whole distance
covered will then be expressed approximately by the sum

$$\sum_{i=1}^{n} f(\xi_i)\Delta t_i,$$

and the exact value of the distance s covered in the time from a to b will
be the limit of such sums for finer and finer subdivisions; that is, it will
be the limit (29)

$$s = \lim_{\max \Delta t_i \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta t_i.$$


It would be easy to give many examples of practical problems leading
to the calculation of such a limit. We will discuss some of them later,
but for the moment the examples already given will sufficiently indicate
the importance of this idea. The limit (29) is called the definite integral
of the function f(x) taken over the interval [a, b], and it is denoted by

$$\int_a^b f(x)\,dx.$$

The expression f(x) dx is called the integrand, a and b are the limits of
integration; a is the lower limit, b is the upper limit.

The connection between differential and integral calculus. As an
example of the direct calculation of a definite integral, we may take
example 2, §1. We may now say that the problem considered there reduces
to calculation of the definite integral

$$\int_0^h ax\,dx.$$

Another example was considered in §3, where we solved the problem
of finding the area bounded by the parabola $y = x^2$. Here the problem
reduces to calculation of the integral

$$\int_0^1 x^2\,dx.$$

We were able to calculate both these integrals directly, because we have
simple formulas for the sum of the first n natural numbers and for the
sum of their squares. But for an arbitrary function f(x), we are far from
being able to add up the sum (28) (that is, to express the result in a simple
formula) if the points $\xi_i$ and the increments $\Delta x_i$ are given to suit some
particular problem. Moreover, even when such a summation is possible,
there is no general method for carrying it out; various methods, each
of a quite special character, must be used in the various cases.

So we are confronted by the problem of finding a general method for
the calculation of definite integrals. Historically this question interested
mathematicians for a long period of time, since there were many practical
aspects involved in a general method for finding the area of curvilinear
figures, the volume of bodies bounded by a curved surface, and so forth.

We have already noted that Archimedes was able to calculate the area
of a segment and of certain other figures. The number of special problems
that could be solved, involving areas, volumes, centers of gravity of solids,
and so forth, gradually increased, but progress in finding a general method
was at first extremely slow. The general method could not be discovered
until sufficient theoretical and computational material had been accumulated
through the demands of practical life. The work of gathering and
generalizing this material proceeded very gradually until the end of the
Middle Ages; and its subsequent energetic development was a direct
consequence of the rapid growth in the productive powers of Europe
resulting from the breakup of the former (feudal) methods of manufacturing
and the creation of new ones (capitalistic).

The accumulation of facts connected with definite integrals proceeded
alongside of the corresponding investigations of problems related to the
derivative of a function. The reader already knows from §1 that this
immense preparatory labor was crowned with success in the 17th century
by the work of Newton and Leibnitz. It is in this sense that Newton and
Leibnitz are the creators of the differential and integral calculus.

One of the fundamental contributions of Newton and Leibnitz consists
of the fact that they finally cleared up the profound connection between
differential and integral calculus, which provides us, in particular, with
a general method of calculating definite integrals for an extremely wide
class of functions.

To explain this connection, we turn to an example from mechanics.

We suppose that a material point is moving along a straight line with
velocity v = f(t), where t is the time. We already know that the distance
$\sigma$ covered by our point in the time between $t = t_1$ and $t = t_2$ is given
by the definite integral

$$\sigma = \int_{t_1}^{t_2} f(t)\,dt.$$

Now let us assume that the law of motion of the point is known to us;
that is, we know the function s = F(t) expressing the dependence on the
time t of the distance s calculated from some initial point A on the straight
line. The distance $\sigma$ covered in the interval of time $[t_1, t_2]$ is obviously
equal to the difference

$$\sigma = F(t_2) - F(t_1).$$

In this way we are led by physical considerations to the equality

$$\int_{t_1}^{t_2} f(t)\,dt = F(t_2) - F(t_1),$$

which expresses the connection between the law of motion of our point
and its velocity.

From a mathematical point of view the function F(t), as we already
know from §5, may be defined as a function whose derivative for all
values of t in the given interval is equal to f(t), that is

$$F'(t) = f(t).$$

Such a function is called a primitive for f(t).

We must keep in mind that if the function f(t) has at least one primitive,
then along with this one it will have an infinite number of others; for if
F(t) is a primitive for f(t), then F(t) + C, where C is an arbitrary constant,
is also a primitive. Moreover, in this way we exhaust the whole set of
primitives for f(t), since if $F_1(t)$ and $F_2(t)$ are primitives for the same
function f(t), then their difference $\phi(t) = F_1(t) - F_2(t)$ has a derivative
$\phi'(t)$ that is equal to zero at every point in a given interval so that $\phi(t)$
is a constant.*

From a physical point of view the various values of the constant C
determine laws of motion which differ from one another only in the fact
that they correspond to all possible choices for the initial point of the
motion.

We are thus led to the result that for an extremely wide class of functions
f(x), including all cases where the function f(x) may be considered as the
velocity of a point at the time x, we have the following equality†

$$\int_a^b f(x)\,dx = F(b) - F(a), \tag{30}$$

where F(x) is an arbitrary primitive for f(x).

This equality is the famous formula of Newton and Leibnitz, which
reduces the problem of calculating the definite integral of a function to
finding a primitive for the function and in this way forms a link between
the differential and the integral calculus.

Many particular problems that were studied by the greatest mathematicians
are automatically solved by this formula, stating that the definite
integral of the function f(x) on the interval [a, b] is equal to the difference
between the values of any primitive at the right and left ends of the
interval.‡ It is customary to write the difference (30) thus:

$$F(x)\Big|_a^b = F(b) - F(a).$$
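Formula (30) can be compared with the defining sums in a few lines of code. The Python sketch below is an illustration only: the function names `newton_leibniz` and `midpoint_sum` are assumptions, and the test function cos x with primitive sin x is chosen here merely as a convenient case.

```python
import math

def newton_leibniz(F, a, b):
    """Right side of formula (30): F(b) - F(a) for a primitive F."""
    return F(b) - F(a)

def midpoint_sum(f, a, b, n):
    """Sum (28) with n equal parts and each xi_i at the midpoint of its part."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

if __name__ == "__main__":
    a, b = 0.0, math.pi / 2
    exact = newton_leibniz(math.sin, a, b)      # sin is a primitive of cos
    for n in (10, 100, 1000):
        print(n, midpoint_sum(math.cos, a, b, n), exact)
```

The single subtraction F(b) − F(a) gives at once the value that the sums approach only gradually, which is the practical content of the formula.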


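Example 1. The equality

$$\left(\frac{x^3}{3}\right)' = x^2$$

shows that the function $x^3/3$ is a primitive for the function $x^2$. Thus, by the
formula of Newton and Leibnitz,

$$\int_a^b x^2\,dx = \frac{x^3}{3}\Big|_a^b = \frac{b^3}{3} - \frac{a^3}{3}.$$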



* By the mean value theorem

$$\phi(t) - \phi(t_0) = \phi'(\xi)(t - t_0) = 0,$$

where $\xi$ lies between t and $t_0$. Thus $\phi(t) = \phi(t_0) = \text{const}$ for all t.

† It is possible to prove mathematically, without recourse to examples from mechanics,
that if the function f(x) is continuous (and even if it is discontinuous but Lebesgue-
summable; see Chapter XV) on the interval [a, b], then there exists a primitive F(x)
satisfying equality (30).

‡ This formula has been generalized in various ways (see for example §13, the formula
of Ostrogradskii).

Example 2. Let c and c' be two electric charges, on a straight line
at distance r from each other. The attraction F between them is directed
along this straight line and is equal to

$$F = \frac{a}{r^2}$$

(a = kcc', where k is a constant). The work W done by this force, when
the charge c remains fixed but c' moves along the interval $[R_1, R_2]$, may
be calculated by dividing the interval $[R_1, R_2]$ into parts $\Delta r_i$. On each
of these parts we may consider the force to be approximately constant,
so that the work done on each part is equal to $(a/r_i^2)\,\Delta r_i$. Making the parts
smaller and smaller, we see that the work W is equal to the integral

$$W = \lim_{\max \Delta r_i \to 0} \sum_i \frac{a}{r_i^2}\,\Delta r_i = \int_{R_1}^{R_2} \frac{a}{r^2}\,dr.$$

The value of this integral can be calculated at once, if we recall that

$$\left(-\frac{a}{r}\right)' = \frac{a}{r^2},$$

so that

$$W = -\frac{a}{r}\Big|_{R_1}^{R_2} = a\left(\frac{1}{R_1} - \frac{1}{R_2}\right).$$

In particular, the work done by a force F as the charge c', initially at a
distance $R_1$ from c, moves out to infinity, is equal to

$$W = \lim_{R_2 \to \infty} a\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{a}{R_1}.$$
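The work integral of Example 2 lends itself to a quick numerical check. The Python sketch below is an illustration with hypothetical names (`work_exact`, `work_sum`) and arbitrary sample values for a, $R_1$ and $R_2$; it compares the closed form $a(1/R_1 - 1/R_2)$, which formula (30) yields for the primitive $-a/r$, with a direct sum of force times displacement.

```python
def work_exact(a, r1, r2):
    """Closed form from formula (30) with the primitive -a/r."""
    return a * (1 / r1 - 1 / r2)

def work_sum(a, r1, r2, n):
    """Sum of (a / r_i^2) * dr over n equal parts, r_i at the midpoint of each part."""
    dr = (r2 - r1) / n
    return sum(a / (r1 + (i + 0.5) * dr) ** 2 * dr for i in range(n))

if __name__ == "__main__":
    a, r1 = 1.0, 1.0
    for r2 in (4.0, 100.0, 1e6):
        print(r2, work_sum(a, r1, r2, 100000), work_exact(a, r1, r2))
    print("limit a / R_1 =", a / r1)
```

As $R_2$ grows, the exact values approach $a/R_1$; the finite sum follows them only as long as the parts remain small, which is why the last row of such an experiment is the least accurate.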



From the arguments given above for the formula of Newton and
Leibnitz, it is clear that this formula gives mathematical expression to an
actual tie existing in the objective world. It is a beautiful and important
example of how mathematics gives expression to objective laws. We
should remark that in his mathematical investigations, Newton always
took a physical point of view. His work on the foundations of differential
and integral calculus cannot be separated from his work on the foundations
of mechanics.

The concepts of mathematical analysis, such as the derivative or the
integral, as they presented themselves to Newton and his contemporaries,
had not yet completely “broken away” from their physical and geometric
origins, such as velocity and area. In fact, they were half mathematical
in character and half physical. The conditions existing at that time were
not yet suitable for producing a purely mathematical definition of
these concepts. Consequently, the investigator could handle them correctly
in complicated situations only if he remained in close contact
with the practical aspects of his problem even during the intermediate
(mathematical) stages of his argument.

From this point of view the creative work of Newton was different in
character from that of Leibnitz.* Newton was guided at all stages by a
physical way of looking at the problem. But the investigations of Leibnitz
do not have such an immediate connection with physics, a fact that in the
absence of clear-cut mathematical definitions sometimes led him to mistaken
conclusions. On the other hand, the most characteristic feature
of the creative activity of Leibnitz was his striving for generality, his
efforts to find the most general methods for the problems of mathematical
analysis.

The greatest merit of Leibnitz was his creation of a mathematical
symbolism expressing the essence of the matter. The notations for such
fundamental concepts of mathematical analysis as the differential $dx$,
the second differential $d^2x$, the integral $\int y\,dx$, and the derivative $d/dx$
were proposed by Leibnitz. The fact that these notations are still used
shows how well they were chosen.

One advantage of a well-chosen symbolism is that it makes our proofs
and calculations shorter and easier; also, it sometimes protects us against
mistaken conclusions. Leibnitz, who was well aware of this, paid especial
attention in all his work to the choice of notation.

The evolution of the concepts of mathematical analysis (derivative,
integral, and so forth) continued, of course, after Newton and Leibnitz
and is still continuing in our day; but there is one stage in this evolution
that should be mentioned especially. It took place at the beginning of the
last century and is related particularly to the work of Cauchy.

Cauchy gave a clear-cut formal definition of the concept of a limit and
used it as the basis for his definitions of continuity, derivative, differential,
and integral. These definitions have been introduced at the corresponding
places in the present chapter. They are widely used in present-day
analysis.

The great importance of these achievements lies in the fact that it is
now possible to operate in a purely formal way not only in arithmetic,
algebra, and elementary geometry, but also in this new and very extensive
branch of mathematics, mathematical analysis, and to obtain correct
results in so doing.

* The discoveries of Newton and Leibnitz were made independently.

Regarding practical application of the results of mathematical analysis,
it is now possible to say: If the original data are verified in the actual
world, then the results of our mathematical arguments will also be verified
there. If we are properly assured of the accuracy of the original data,
then there is no need to make a practical check of the correctness of the
mathematical results; it is sufficient to check only the correctness of the
formal arguments.

This statement naturally requires the following limitation. In
mathematical arguments the original data, which we take from the actual
world, are true only up to a certain accuracy. This means that at every step
of our mathematical argument the results obtained will contain certain
errors, which may accumulate as the number of steps in the argument
increases.*

Returning now to the definite integral, let us consider a question of
fundamental importance. For what functions $f(x)$, defined on the interval
$[a, b]$, is it possible to guarantee the existence of the definite integral
$\int_a^b f(x)\,dx$, namely a number to which the sum
$\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i$ tends as limit as
$\max \Delta x_i \to 0$? It must be kept in view that this number is to be
the same for all subdivisions of the interval $[a, b]$ and all choices of
the points $\xi_i$.

Functions for which the definite integral, namely the limit (29), exists
are said to be integrable on the interval $[a, b]$. Investigations carried
out in the last century show that all continuous functions are integrable.
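The requirement that the limit be the same for all choices of the points $\xi_i$ can be watched numerically. The following is only a minimal sketch: the function $\sin x$ on $[0, 1]$, the equal subdivisions, and the three rules for choosing $\xi_i$ are illustrative assumptions, not part of the text.

```python
import math


def riemann_sum(f, a, b, n, pick):
    """Sum f(xi_i) * dx_i over n equal subintervals of [a, b].

    pick(left, right) chooses the sample point xi_i in each subinterval.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        left = a + i * dx
        right = left + dx
        total += f(pick(left, right)) * dx
    return total


def left_end(left, right):
    return left


def right_end(left, right):
    return right


def midpoint(left, right):
    return 0.5 * (left + right)


# The integral of sin x over [0, 1], from the formula of Newton and Leibnitz.
exact = 1.0 - math.cos(1.0)

for n in (10, 100, 1000):
    sums = [riemann_sum(math.sin, 0.0, 1.0, n, pick)
            for pick in (left_end, right_end, midpoint)]
    print(n, [round(s, 6) for s in sums], round(exact, 6))
```

For this continuous function the three columns draw together as the subdivision is refined, whichever rule is used for the points $\xi_i$.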

But there are also discontinuous functions which are integrable. Among
them, for example, are those functions which are bounded and either
increasing or decreasing on the interval $[a, b]$.

The function that is equal to zero at the rational points in $[a, b]$ and
equal to unity at the irrational points may serve as an example of a
nonintegrable function, since for an arbitrary subdivision the integral sum
$s_n$ will be equal to zero or to $b - a$, depending on whether we choose
the points $\xi_i$ as rational numbers or irrational.

Let us note that in many cases the formula of Newton and Leibnitz
provides an answer to the practical question of calculating a definite
integral. But here arises the problem of finding a primitive for a given
function; that is, of finding a function that has the given function for its
derivative. We now proceed to discuss this problem. Let us note by the
way that the problem of finding a primitive has great importance in other
branches of mathematics also, particularly in the solution of differential
equations.

* For example, it follows formally from $a = b$ and $b = c$ that $a = c$.
But in practice this relation appears as follows: From the facts that
$a = b$ is known with accuracy up to $\epsilon$ and $b = c$ is known with
the same accuracy, it follows that $a = c$ is known with accuracy up to
$2\epsilon$.


§11. Indefinite Integrals; the Technique of Integration

An arbitrary primitive of a given function $f(x)$ is usually called an
indefinite integral of $f(x)$ and is written in the form

$$\int f(x)\,dx.$$

In this way, if $F(x)$ is a completely determined primitive of $f(x)$, then
the indefinite integral of $f(x)$ is given by

$$\int f(x)\,dx = F(x) + C, \qquad (31)$$

where $C$ is an arbitrary constant.

Let us also note that if the function $f(x)$ is given on the interval
$[a, b]$, if $F(x)$ is a primitive for $f(x)$, and if $x$ is a point in the
interval $[a, b]$, then by the formula of Newton and Leibnitz we may write

$$F(x) = F(a) + \int_a^x f(t)\,dt.$$

Here the integral on the right side differs from the primitive $F(x)$ only
by the constant $F(a)$. In such a case this integral, if we consider it as a
function of its upper limit $x$ (for variable $x$), is a completely
determined primitive of $f(x)$. Consequently, an indefinite integral of
$f(x)$ may also be written as follows:

$$\int f(x)\,dx = \int_a^x f(t)\,dt + C,$$

where $C$ is an arbitrary constant.

Let us set up a fundamental table of indefinite integrals, which
can be obtained directly from the corresponding table of derivatives
(see §6):

$$
\left.
\begin{aligned}
\int x^a\,dx &= \frac{x^{a+1}}{a+1} + C \quad (a \neq -1),\\
\int \frac{dx}{x} &= \ln|x| + C,{}^{*}\\
\int a^x\,dx &= \frac{a^x}{\ln a} + C,\\
\int e^x\,dx &= e^x + C,\\
\int \sin x\,dx &= -\cos x + C,\\
\int \cos x\,dx &= \sin x + C,\\
\int \sec^2 x\,dx &= \tan x + C,\\
\int \frac{dx}{\sqrt{1 - x^2}} &= \arcsin x + C = -\arccos x + C_1 \quad \left(C_1 - C = \frac{\pi}{2}\right),\\
\int \frac{dx}{1 + x^2} &= \arctan x + C.
\end{aligned}
\right\} \qquad (32)
$$


The general properties of indefinite integrals may also be deduced from
the corresponding properties of derivatives. For example, from the rule
for the differentiation of a sum we obtain the formula

$$\int [f(x) \pm \phi(x)]\,dx = \int f(x)\,dx \pm \int \phi(x)\,dx + C,$$

and from the corresponding rule expressing the fact that a constant factor
$k$ may be taken outside the sign of differentiation we get

$$\int k f(x)\,dx = k \int f(x)\,dx + C.$$

For example,

$$\int \left(3x^2 + 2x - \frac{3}{\sqrt{x}} + \frac{4}{x} - 1\right) dx
= 3\cdot\frac{x^3}{3} + \frac{2x^2}{2} - 3\cdot\frac{x^{-1/2+1}}{-\frac{1}{2}+1} + 4\ln|x| - x + C.$$

* For $x > 0$, $(\ln|x|)' = (\ln x)' = 1/x$; for $x < 0$,
$(\ln|x|)' = [\ln(-x)]' = \frac{1}{-x}\cdot(-1) = 1/x$.

There are a number of methods for calculating indefinite integrals. Let
us consider one of them, namely the method of substitution or change of
variable, which is based on the following equality

$$\int f(x)\,dx = \int f[\phi(t)]\,\phi'(t)\,dt + C, \qquad (33)$$

where $x = \phi(t)$ is a differentiable function. The relation (33) is to be
understood in the sense that if in the function

$$F(x) = \int f(x)\,dx,$$

on the left side of equality (33), we set $x = \phi(t)$, we thereby obtain a
function $F[\phi(t)]$ whose derivative with respect to $t$ is equal to the
expression under the sign of integration on the right side of equality (33).
This fact follows immediately from the theorem on the derivative of a
function of a function.

Let us give some examples of this method of substitution 


(substitution of kx = t, from which k dx = dt). 


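$$\int \frac{x\,dx}{\sqrt{a^2 - x^2}} = -\int dt = -t + C = -\sqrt{a^2 - x^2} + C$$

(substitution of $t = \sqrt{a^2 - x^2}$, from which
$dt = -\dfrac{x\,dx}{\sqrt{a^2 - x^2}}$).

$$
\begin{aligned}
\int \sqrt{a^2 - x^2}\,dx &= \int \sqrt{a^2 - a^2\sin^2 u}\;a\cos u\,du = a^2 \int \cos^2 u\,du\\
&= a^2 \int \frac{1 + \cos 2u}{2}\,du = \frac{a^2}{2}\left(u + \frac{\sin 2u}{2}\right) + C\\
&= \frac{a^2}{2}\,(u + \sin u \cos u) + C\\
&= \frac{a^2}{2}\left(\arcsin\frac{x}{a} + \frac{x}{a^2}\sqrt{a^2 - x^2}\right) + C
\end{aligned}
$$

(substitution of $x = a \sin u$).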
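A result obtained by substitution can always be checked by differentiation. As a rough numerical sketch, let us compare the slope of the primitive of $\sqrt{a^2 - x^2}$ found by the substitution $x = a \sin u$ with the integrand itself; the value $a = 2$, the sample points, and the step $h$ are arbitrary assumptions.

```python
import math


def primitive(x, a):
    """Primitive of sqrt(a^2 - x^2) obtained by the substitution x = a sin u."""
    return 0.5 * a * a * (math.asin(x / a) + x * math.sqrt(a * a - x * x) / (a * a))


def integrand(x, a):
    return math.sqrt(a * a - x * x)


def central_difference(F, x, h=1e-6):
    """Approximate F'(x) by a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)


a = 2.0
for x in (-1.5, 0.0, 0.7, 1.9):
    slope = central_difference(lambda t: primitive(t, a), x)
    print(x, slope, integrand(x, a))
```

The two printed columns should agree to within the accuracy of the difference quotient, as they must if the primitive is correct.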

As can be seen from these examples, the method of substitution or
change of variables greatly extends the class of elementary functions
that we are able to integrate; that is, for which we can find primitives
that are themselves elementary functions. But it must be noted that from
the point of view of actually calculating the result, we are in a much worse
position, generally speaking, with respect to integration than for
differentiation.

From §6 we know that the derivative of an arbitrary elementary function
is itself an elementary function, which we may effectively calculate by
making use of the rules of differentiation. But the converse statement
is in general untrue, since there exist elementary functions whose indefinite
integrals are not elementary functions. Examples are $e^{-x^2}$, $1/\ln x$,
$(\sin x)/x$, and so forth. To obtain integrals of these functions we must
make use of approximate methods and also introduce new functions which
cannot be reduced to elementary ones. We cannot spend more time here on
this question but must simply note that even in elementary mathematics it is
possible to find many examples in which a direct operation can be carried
out on a certain class of numbers, while the inverse operation cannot
be carried out on the same class; thus, the square of an arbitrary rational
number is again a rational number, but the square root of a rational
number is by no means always rational. Analogously, differentiation of
elementary functions produces a function that is again elementary, but
integration may lead us outside the class of elementary functions.

Some of the integrals that cannot be expressed in terms of elementary
functions have great importance in mathematics and its applications.
An example is

$$\int_0^x e^{-t^2}\,dt,$$

which plays a very important role in the theory of probability (see Chapter
XI). Other examples are the integrals

$$\int_0^{\phi} \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}} \quad\text{and}\quad \int_0^{\phi} \sqrt{1 - k^2 \sin^2\theta}\;d\theta \qquad (k^2 < 1),$$

which are called elliptic integrals of the first and second kind respectively.
We are led to the calculation of these integrals by a large number of
problems in physics (see Chapter V, §1, example 3). Detailed tables of
these integrals for various values of the arguments $x$ and $\phi$ have been
calculated by approximate methods but with great accuracy.
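To give an idea of how such values may be computed approximately, here is a minimal sketch using Simpson's rule. The rule itself, the number of subintervals, and the limits of integration $0$ to $x$ and $0$ to $\phi$ are assumptions made for illustration; the text does not say which methods were used for the tables. The comparison relies on the standard relation $\int_0^x e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(x)$.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def gauss_integral(x):
    """Approximate the integral of exp(-t^2) from 0 to x."""
    return simpson(lambda t: math.exp(-t * t), 0.0, x)


def elliptic_first_kind(phi, k):
    """Approximate the elliptic integral of the first kind for k^2 < 1."""
    return simpson(lambda th: 1.0 / math.sqrt(1.0 - (k * math.sin(th)) ** 2), 0.0, phi)


# Compare with the error function from the standard library.
print(gauss_integral(1.0), 0.5 * math.sqrt(math.pi) * math.erf(1.0))
print(elliptic_first_kind(math.pi / 2, 0.5))
```

For $k = 0$ the elliptic integral of the first kind reduces to $\phi$ itself, which gives a simple check on the second function.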

It must be emphasized that the proof of the very fact that a given
elementary function cannot be integrated in terms of elementary functions is
in each case quite difficult. Such questions occupied the attention of
outstanding mathematicians in the last century and have played an important
role in the development of analysis. Fundamental results were obtained
here by Čebyšev, who gave a complete answer to the question of expressing
in terms of elementary functions the integrals of the form

$$\int x^m (a + b x^s)^p\,dx,$$

where $m$, $s$, and $p$ are rational numbers. Up to his time three relations,
obtained by Newton, were known for the exponents $m$, $s$, and $p$, which
implied the integrability of this integral in terms of elementary functions.
Čebyšev proved that in all other cases the integral cannot be expressed
in terms of elementary functions.
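The three relations are not written out here. In the form in which the theorem is usually quoted (an addition to the text, stated for $a \neq 0$, $b \neq 0$, $s \neq 0$), the integral is elementary exactly when at least one of the numbers $p$, $(m+1)/s$, $(m+1)/s + p$ is an integer. A small sketch of this test, with a hypothetical function name:

```python
from fractions import Fraction


def is_elementary(m, s, p):
    """Test whether the integral of x^m (a + b x^s)^p dx is elementary.

    m, s, p are rational, with s != 0 and a, b != 0. The criterion, as
    usually quoted: at least one of p, (m + 1)/s, (m + 1)/s + p must be
    an integer.
    """
    m, s, p = Fraction(m), Fraction(s), Fraction(p)
    q = (m + 1) / s
    return any(value.denominator == 1 for value in (p, q, q + p))


half = Fraction(1, 2)
print(is_elementary(1, 2, half))   # x * sqrt(a + b x^2)
print(is_elementary(0, 2, half))   # sqrt(a + b x^2)
print(is_elementary(0, 3, half))   # sqrt(a + b x^3)
print(is_elementary(0, 4, -half))  # 1 / sqrt(a + b x^4)
```

The first two integrands satisfy one of the relations; the last two satisfy none, and their integrals lead to elliptic integrals of the kind mentioned above.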

We introduce here another method of integration, namely integration
by parts. It is based on the formula we already know

$$(uv)' = uv' + u'v,$$

for the derivative of the product of the functions $u$ and $v$. This formula
may also be written

$$uv' = (uv)' - u'v.$$

Let us now integrate the left and right sides, keeping in mind that

$$\int (uv)'\,dx = uv + C.$$

We now finally obtain the equality

$$\int uv'\,dx = uv - \int u'v\,dx,$$

which is also called the formula of integration by parts. We have not written
the constant $C$ since we may consider that it is included in one of the
indefinite integrals occurring in this equation.

Let us introduce some applications of this formula. Suppose we have
to calculate $\int x e^x\,dx$. Here we will take $u = x$ and $v' = e^x$, and
thus $u' = 1$, $v = e^x$, and consequently

$$\int x e^x\,dx = x e^x - \int 1 \cdot e^x\,dx = x e^x - e^x + C.$$

In the integral $\int \ln x\,dx$ it is convenient to take $u = \ln x$,
$v' = 1$, so that $u' = 1/x$, $v = x$ and

$$\int \ln x\,dx = x \ln x - \int dx = x \ln x - x + C.$$

In the following characteristic example it is necessary to integrate
twice by parts and then to find the desired integral from the equations
so obtained:

$$
\begin{aligned}
\int e^x \sin x\,dx &= e^x \sin x - \int e^x \cos x\,dx\\
&= e^x \sin x - e^x \cos x - \int e^x \sin x\,dx,
\end{aligned}
$$

from which

$$\int e^x \sin x\,dx = \frac{e^x}{2}\,(\sin x - \cos x) + C.$$
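The three results obtained by parts can be tested against the formula of Newton and Leibnitz: the difference of the primitive at the ends of an interval must equal the definite integral over it. The sketch below computes the definite integrals by Simpson's rule; the rule and the intervals are assumptions chosen only for the check.

```python
import math


def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# (integrand, primitive found by parts, interval of comparison)
cases = [
    (lambda x: x * math.exp(x),
     lambda x: x * math.exp(x) - math.exp(x), (0.0, 1.0)),
    (math.log,
     lambda x: x * math.log(x) - x, (1.0, 2.0)),
    (lambda x: math.exp(x) * math.sin(x),
     lambda x: 0.5 * math.exp(x) * (math.sin(x) - math.cos(x)), (0.0, 2.0)),
]

for f, F, (a, b) in cases:
    print(simpson(f, a, b), F(b) - F(a))
```

Each printed pair should agree to many decimal places.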

We end this section here; from it the reader will have obtained only a
superficial idea of the theory of integration. We have not given any
attention to many different methods in this theory. In particular we have not
touched here on the very interesting question of the integration of rational
fractions, a theory in which an important contribution was made by the
well-known mathematician and mechanician, Ostrogradskiĭ.

§12. Functions of Several Variables 

Up to now we have spoken only of functions of one variable, but in
practice it is often necessary to deal also with functions depending on
two, three, or in general many variables. For example, the area of a
rectangle is a function

$$S = xy$$

of its base $x$ and its height $y$. The volume of a rectangular
parallelepiped is a function

$$v = xyz$$

of its three dimensions. The distance between two points $A$ and $B$ is a
function

$$r = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$

of the six coordinates of these points. The well-known formula

$$pv = RT$$

expresses the dependence of the volume $v$ of a definite amount of gas
on the pressure $p$ and absolute temperature $T$.

Functions of several variables, like functions of one variable, are in
many cases defined only on a certain region of values of the variables
themselves. For example, the function

$$u = \ln(1 - x^2 - y^2 - z^2) \qquad (34)$$

is defined only for values of $x$, $y$ and $z$ that satisfy the condition

$$x^2 + y^2 + z^2 < 1. \qquad (35)$$

(For other $x$, $y$, $z$ its values are not real numbers.) The set of points
of space whose coordinates satisfy the inequality (35) obviously fills up a
sphere of unit radius with its center at the origin of coordinates. The
points on the boundary are not included in this sphere; the surface of the
sphere has been so to speak "peeled off." Such a sphere is said to be open.
The function (34) is defined only for such sets of three numbers $(x, y, z)$
as are coordinates of points in the open sphere $G$. It is customary to state
this fact concisely by saying that the function (34) is defined on the
sphere $G$.

Let us give another example. The temperature of a nonuniformly
heated body $V$ is a function of the coordinates $x$, $y$, $z$ of the points
of the body. This function is not defined for all sets of three numbers
$x$, $y$, $z$ but only for such sets as are coordinates of points of the
body $V$.

Finally, as a third example, let us consider the function

$$u = \phi(x) + \phi(y) + \phi(z),$$

where $\phi$ is a function of one variable defined on the interval $[0, 1]$.
Obviously the function $u$ is defined only for sets of three numbers
$(x, y, z)$ which are coordinates of points in the cube:

$$0 \le x \le 1, \quad 0 \le y \le 1, \quad 0 \le z \le 1.$$

We now give a formal definition of a function of three variables. Suppose
that we are given a set $E$ of triples of numbers $(x, y, z)$ (points of
space). If to each of these triples of numbers (points) of $E$ there
corresponds a definite number $u$ in accordance with some law, then $u$ is
said to be a function of $x$, $y$, $z$ (of the point), defined on the set of
triples of numbers (on the points) $E$, a fact which is written thus:

$$u = F(x, y, z).$$

In place of $F$ we may also write other letters: $f$, $\phi$, $\psi$.

In practice the set $E$ will usually be a set of points, filling out some
geometrical body or surface: sphere, cube, annulus, and so forth, and then
we simply say that the function is defined on this body or surface. Functions
of two, four, and so forth, variables are defined analogously.

Implicit definition of a function. Let us note that functions of two
variables may serve, under certain circumstances, as a useful means for the
definition of functions of one variable. Given a function $F(x, y)$ of two
variables let us set up the equation

$$F(x, y) = 0. \qquad (36)$$

In general, this equation will define a certain set of points $(x, y)$ of the
plane at which our function is equal to zero. Such sets of points usually
represent curves that may be considered as the graphs of one or several
one-valued functions $y = \phi(x)$ or $x = \psi(y)$ of one variable. In such a
case these one-valued functions are said to be defined implicitly by the
equation (36). For example, the equation

$$x^2 + y^2 - r^2 = 0$$

gives an implicit definition of two functions of one variable

$$y = +\sqrt{r^2 - x^2} \quad\text{and}\quad y = -\sqrt{r^2 - x^2}.$$

But it is necessary to keep in mind that an equation of the form (36)
may fail to define any function at all. For example, the equation

$$x^2 + y^2 + 1 = 0$$

obviously does not define any real function, since no pair of real numbers
satisfies it.

Geometric representation. Functions of two variables may always
be visualized as surfaces by means of a system of space coordinates. Thus
the function

$$z = f(x, y) \qquad (37)$$

is represented in a three-dimensional rectangular coordinate system by a
surface, which is the geometric locus of points $M$ whose coordinates
$x$, $y$, $z$ satisfy equation (37) (figure 26).

Fig. 26.

There is another, extremely useful method of representing the function
(37), which has found wide application in practice. Let us choose a
sequence of numbers $z_1, z_2, \cdots$, and then draw on one and the same
plane $Oxy$ the curves

$$z_1 = f(x, y), \quad z_2 = f(x, y), \quad \cdots,$$

which are the so-called level lines of the function $f(x, y)$. From a set
of level lines, if they correspond to values of $z$ that are sufficiently
close to one another, it is possible to form a very good opinion of the
variation of the function $f(x, y)$, just as from the level lines of a
topographical map one may judge the variation in altitude of the locality.

Figure 27 shows a map of the level lines of the function $z = x^2 + y^2$,
the diagram at the right indicating how the function is built up from
its level lines. In Chapter III, figure 50, a similar map is drawn for the
level lines of the function z

Fig. 27.

Partial derivatives and differential. Let us make some remarks about
the differentiation of the functions of several variables. As an example
we take the arbitrary function

$$z = f(x, y)$$

of two variables. If we fix the value of $y$, that is if we consider it as
not varying, then our function of two variables becomes a function of the
one variable $x$. The derivative of this function with respect to $x$, if it
exists, is called the partial derivative with respect to $x$ and is denoted
thus:

$$\frac{\partial z}{\partial x}, \quad\text{or}\quad \frac{\partial f}{\partial x}, \quad\text{or}\quad f'_x(x, y).$$

The last of these three notations indicates clearly that the partial
derivative with respect to $x$ is in general a function of $x$ and $y$. The
partial derivative with respect to $y$ is defined similarly.

Geometrically the function $f(x, y)$ represents a surface in a rectangular
three-dimensional system of coordinates. The corresponding function of $x$
for fixed $y$ represents a plane curve (figure 28) obtained from the
intersection of the surface with a plane parallel to the plane $Oxz$ and at
a distance $y$ from it. The partial derivative $\partial z/\partial x$ is
obviously equal to the trigonometric tangent of the angle between the
tangent to the curve at the point $(x, y)$ and the positive direction of the
$x$-axis.

More generally, if we consider a function $z = f(x_1, x_2, \cdots, x_n)$ of
the $n$ variables $x_1, x_2, \cdots, x_n$, the partial derivative
$\partial z/\partial x_i$ is defined as the derivative of this function
with respect to $x_i$, calculated for fixed values of the other variables:

$$x_1, x_2, \cdots, x_{i-1}, x_{i+1}, \cdots, x_n.$$

We may say that the partial derivative of a function with respect to the
variable $x_i$ is the rate of change of this function in the direction of the
change in $x_i$. It would also be possible to define a derivative in an
arbitrary assigned direction, not necessarily coinciding with any of the
coordinate axes, but we will not take the time to do this.

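Fig. 28.

Examples.

$$z = \frac{x}{y}, \qquad \frac{\partial z}{\partial x} = \frac{1}{y}, \qquad \frac{\partial z}{\partial y} = -\frac{x}{y^2};$$

$$u = \frac{1}{\sqrt{x^2 + y^2 + z^2}}, \qquad
\frac{\partial u}{\partial x} = -\frac{1}{x^2 + y^2 + z^2}\cdot\frac{2x}{2\sqrt{x^2 + y^2 + z^2}} = -\frac{x}{(x^2 + y^2 + z^2)^{3/2}}.$$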

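It is sometimes necessary to form the partial derivatives of these partial
derivatives; that is, the so-called partial derivatives of second order. For
functions of two variables there are four of them

$$\frac{\partial^2 u}{\partial x^2}, \quad \frac{\partial^2 u}{\partial x\,\partial y}, \quad \frac{\partial^2 u}{\partial y\,\partial x}, \quad \frac{\partial^2 u}{\partial y^2}.$$

However, if these derivatives are continuous, then it is not hard to prove
that the second and third of these four (the so-called mixed derivatives)
coincide:

$$\frac{\partial^2 u}{\partial x\,\partial y} = \frac{\partial^2 u}{\partial y\,\partial x}.$$

For example, in the case of the first function considered,

$$\frac{\partial^2 z}{\partial x^2} = 0, \quad \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{1}{y^2}, \quad \frac{\partial^2 z}{\partial y\,\partial x} = -\frac{1}{y^2}, \quad \frac{\partial^2 z}{\partial y^2} = \frac{2x}{y^3},$$

the two mixed derivatives are seen to coincide.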
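The definition of a partial derivative, in which all variables but one are held fixed, translates directly into difference quotients. The following sketch does this for the function $z = x/y$ of the first example; the point $(1.5, 2)$ and the step $h$ are arbitrary assumptions, and the quotients only approximate the derivatives.

```python
def z(x, y):
    return x / y


def d_dx(f, x, y, h=1e-4):
    """Difference quotient in x with y held fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def d_dy(f, x, y, h=1e-4):
    """Difference quotient in y with x held fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


x0, y0 = 1.5, 2.0

# Mixed derivatives taken in the two possible orders.
z_xy = d_dy(lambda x, y: d_dx(z, x, y), x0, y0)
z_yx = d_dx(lambda x, y: d_dy(z, x, y), x0, y0)

print(d_dx(z, x0, y0), 1 / y0)
print(d_dy(z, x0, y0), -x0 / y0 ** 2)
print(z_xy, z_yx, -1 / y0 ** 2)
```

Both orders of differentiation give values close to $-1/y^2$, in agreement with the coincidence of the mixed derivatives.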

For functions of several variables, just as was done for functions of
one variable, we may introduce the concept of a differential.

For definiteness let us consider a function

$$z = f(x, y)$$

of two variables. If it has continuous partial derivatives, we can prove
that its increment

$$\Delta z = f(x + \Delta x, y + \Delta y) - f(x, y),$$

corresponding to the increments $\Delta x$ and $\Delta y$ of its arguments,
may be put in the form

$$\Delta z = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \alpha\sqrt{\Delta x^2 + \Delta y^2},$$

where $\partial f/\partial x$ and $\partial f/\partial y$ are the partial
derivatives of the function at the point $(x, y)$ and the magnitude $\alpha$
depends on $\Delta x$ and $\Delta y$ in such a way that $\alpha \to 0$ as
$\Delta x \to 0$ and $\Delta y \to 0$.

The sum of the first two components

$$dz = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$$

is linearly dependent* on $\Delta x$ and $\Delta y$ and is called the
differential of the function. The third summand, because of the presence of
the factor $\alpha$, tending to zero with $\Delta x$ and $\Delta y$, is an
infinitesimal of higher order than the magnitude

$$\rho = \sqrt{\Delta x^2 + \Delta y^2},$$

describing the change in $x$ and $y$.

Let us give an application of the concept of differential. The period
of oscillation of a pendulum is calculated from the formula

$$T = 2\pi\sqrt{\frac{l}{g}},$$

where $l$ is its length and $g$ is the acceleration of gravity. Let us suppose
that $l$ and $g$ are known with errors respectively equal to $\Delta l$ and
$\Delta g$. Then the error in the calculation of $T$ will be equal to the
increment $\Delta T$ corresponding to the increments of the arguments
$\Delta l$ and $\Delta g$. Replacing $\Delta T$ approximately by $dT$, we
will have

$$\Delta T \approx dT = \pi\left(\frac{\Delta l}{\sqrt{lg}} - \frac{\sqrt{l}\,\Delta g}{g\sqrt{g}}\right).$$

The signs of $\Delta l$ and $\Delta g$ are unknown, but we may obviously
estimate $\Delta T$ by the inequality

$$|\Delta T| \le \pi\left(\frac{|\Delta l|}{\sqrt{lg}} + \frac{\sqrt{l}\,|\Delta g|}{g\sqrt{g}}\right),$$

from which after division by $T$ we get

$$\frac{|\Delta T|}{T} \le \frac{1}{2}\left(\frac{|\Delta l|}{l} + \frac{|\Delta g|}{g}\right).$$

Thus we may consider in practice that the relative error for $T$ does not
exceed half the sum of the relative errors for $l$ and $g$.
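Dividing $dT$ by $T = 2\pi\sqrt{l/g}$ gives $dT/T = \frac{1}{2}(\Delta l/l - \Delta g/g)$, so the first-order estimate of the relative error carries a factor $\frac{1}{2}$. A small numerical sketch, with purely illustrative values of $l$, $g$ and their errors, compares this estimate with the largest actual change of $T$:

```python
import math


def period(l, g):
    return 2 * math.pi * math.sqrt(l / g)


def relative_error_bound(l, dl, g, dg):
    """First-order bound |dT|/T <= (|dl|/l + |dg|/g) / 2."""
    return 0.5 * (abs(dl) / l + abs(dg) / g)


l, dl = 1.000, 0.002   # length and its error in metres (illustrative values)
g, dg = 9.81, 0.02     # gravity and its error in m/s^2 (illustrative values)

T = period(l, g)

# Largest actual relative change of T over the four sign combinations.
worst = max(abs(period(l + sl * dl, g + sg * dg) - T) / T
            for sl in (-1, 1) for sg in (-1, 1))

print(relative_error_bound(l, dl, g, dg), worst)
```

The two numbers differ only by terms of second order in the errors, which is what the replacement of $\Delta T$ by $dT$ leads us to expect.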


* In general a function $Ax + By + C$, where $A$, $B$, $C$ are constants, is
called a linear function of $x$ and $y$. If $C = 0$, it is called a
homogeneous linear function. Here we omit the word "homogeneous."

For symmetry of notation, the increments of the independent variables
$\Delta x$ and $\Delta y$ are usually denoted by the symbols $dx$ and $dy$
and are also called differentials. With this notation the differential of
the function $u = f(x, y, z)$ may be written thus:

$$du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz.$$

Partial derivatives play a large role whenever we have to do with
functions of several variables, as happens in many of the applications of
analysis to technology and physics. We shall be dealing in Chapter VI
with the problem of reconstructing a function from the properties of its
partial derivatives.

In the following paragraphs, we give some simple examples of
applications of partial derivatives in analysis.

Differentiation of implicit functions. Suppose we wish to find the
derivative of $y$, where $y$ is a function of $x$ defined implicitly by the
relation

$$F(x, y) = 0 \qquad (38)$$

between these variables. If $x$ and $y$ satisfy the relation (38) and we give
$x$ the increment $\Delta x$, then $y$ will receive an increment $\Delta y$
such that $x + \Delta x$ and $y + \Delta y$ again satisfy (38).
Consequently*

$$F(x + \Delta x, y + \Delta y) - F(x, y) = \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y + \alpha\sqrt{\Delta x^2 + \Delta y^2} = 0.$$

Thus, provided $\partial F/\partial y \neq 0$, it follows that

$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = y'_x = -\frac{\partial F/\partial x}{\partial F/\partial y}.$$

In this way we have obtained a method for finding the derivative of an
implicit function $y$ without first solving the equation (38) for $y$.
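As an illustration, the rule $y'_x = -\dfrac{\partial F/\partial x}{\partial F/\partial y}$ may be tried on the circle $x^2 + y^2 - r^2 = 0$, where the answer $-x/y$ is also available from the explicit form $y = \sqrt{r^2 - x^2}$. The sketch below approximates the partial derivatives by difference quotients; the radius, the point, and the step are arbitrary assumptions.

```python
import math


def implicit_slope(F, x, y, h=1e-6):
    """Approximate y'(x) = -F_x / F_y at a point (x, y) satisfying F(x, y) = 0."""
    F_x = (F(x + h, y) - F(x - h, y)) / (2 * h)
    F_y = (F(x, y + h) - F(x, y - h)) / (2 * h)
    if F_y == 0:
        raise ZeroDivisionError("dF/dy vanishes at this point")
    return -F_x / F_y


r = 5.0


def circle(x, y):
    return x * x + y * y - r * r


x0 = 3.0
y0 = math.sqrt(r * r - x0 * x0)  # point on the upper branch

print(implicit_slope(circle, x0, y0), -x0 / y0)
```

At the points $(\pm r, 0)$, where $\partial F/\partial y = 0$, the rule does not apply; there the tangent to the circle is vertical.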


* We assume that $F(x, y)$ has continuous derivatives with respect to $x$
and $y$.

Maximum and minimum problems. If a function, let us say of two
variables $z = f(x, y)$, attains its maximum at the point $(x_0, y_0)$, that
is if $f(x_0, y_0) \ge f(x, y)$ for all points $(x, y)$ close to
$(x_0, y_0)$, then this point must also be the point of maximum altitude for
any line formed by the intersection of the surface $z = f(x, y)$ with a
plane parallel to $Oxz$ or $Oyz$. So at such a point we must have

$$f'_x(x, y) = 0, \quad f'_y(x, y) = 0. \qquad (39)$$

The same equations must also hold for a point of local minimum.
Consequently, the greatest or least values of the function are to be sought
first of all at points where the conditions (39) are satisfied, but we must
also not forget about points on the boundary of the domain of definition
of the function and points where the function fails to have a derivative,
if such points exist.

To establish whether a point $(x, y)$ satisfying (39) is actually a maximum
or minimum point, use is frequently made of various indirect arguments.
For example, if for any reason it is clear that the function is
differentiable and attains its minimum inside the region and that there is
only one point where the conditions (39) are fulfilled, then obviously the
minimum must be attained at this point.

For example, let it be required to make a rectangular tin box (without
lid) with assigned volume $V$, using the smallest possible amount of
material. If the sides of the base of this box are denoted by $x$ and $y$,
then its height $h$ will be equal to $V/xy$, and consequently the surface
$S$ will be given by the function

$$S = xy + \frac{V}{xy}(2x + 2y) = xy + 2V\left(\frac{1}{x} + \frac{1}{y}\right) \qquad (40)$$

of $x$ and $y$. Since $x$ and $y$ by the terms of the problem must be
positive, the question has been reduced to finding the minimum of the
function $S(x, y)$ for all possible points $(x, y)$ in the first quadrant of
the plane $(x, y)$, which we will denote by the letter $G$.

If the minimum is attained at some point of the region $G$, then the partial
derivatives must be equal to zero

$$\frac{\partial S}{\partial x} = y - \frac{2V}{x^2} = 0, \qquad \frac{\partial S}{\partial y} = x - \frac{2V}{y^2} = 0;$$

that is $yx^2 = 2V$, $xy^2 = 2V$, from which we find as the dimensions of the
box:

$$x = y = \sqrt[3]{2V} \quad\text{and}\quad h = \frac{1}{2}\sqrt[3]{2V}. \qquad (41)$$
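Before turning to a proof, the dimensions $x = y = \sqrt[3]{2V}$, $h = V/xy$ can be made plausible by direct computation. The following sketch evaluates $S$ on a coarse grid covering part of the first quadrant; the volume $V = 4$ and the grid are illustrative assumptions, and such a search proves nothing about points outside the grid.

```python
def surface(x, y, V):
    """Surface of the open box with base x by y and volume V."""
    return x * y + 2 * V * (1 / x + 1 / y)


V = 4.0
side = (2 * V) ** (1 / 3)    # x = y = cube root of 2V
height = V / (side * side)   # h = V / (x y)
best = surface(side, side, V)

# Coarse grid of points with 0.1 <= x, y <= 10.
grid = [0.1 * k for k in range(1, 101)]
grid_min = min(surface(x, y, V) for x in grid for y in grid)

print(side, height, best, grid_min)
```

No grid value falls below the value of $S$ at the computed dimensions, and the height comes out as half the side of the base.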


We have solved the problem but have not altogether proved that our
solution is correct. A rigorous mathematician will say to us: "You have
supposed from the very beginning that under the given conditions the
box with minimum surface actually exists and, proceeding from this
assumption, you have found its dimensions. So you have really obtained
only the following result: If there exists a point $(x, y)$ in $G$ for which
the function $S$ attains its minimum, then the coordinates of this point must
necessarily be determined by the equation (41). But now you must show
that the minimum of $S$ does exist for some point in $G$ and then I will
admit the correctness of your result." This remark is a very reasonable
one, since, for example, our function $S$, as we shall soon see, does not
possess any maximum in the region $G$. But let us show how it is possible to
convince ourselves that in the given case the function actually does attain
its minimum at a certain point $(x, y)$ of the region $G$.

The fundamental theorem on which we shall base our argument is one that
is proved in analysis with complete rigor; it amounts to the following. If
the function $f$ of one or several variables is everywhere continuous in a
certain finite region $H$ which is bounded and includes its boundary, then
there always exists in $H$ at least one point at which the function attains
its minimum (maximum). With this theorem we can easily complete our
analysis of the problem.

Fig. 29.

Let us consider an arbitrary point $(x_0, y_0)$ of the region $G$; at this
point let $S(x_0, y_0) = N$. Let us also choose a number $R$ satisfying the
two inequalities $R > N$, $2VR > N$ and construct a square $\Omega_R$ with
side $R^2$, as in figure 29, where $AB = CD = 1/R$.

We now give a lower bound for the values of our function $S(x, y)$ at
points of the region $G$ lying outside the square $\Omega_R$. If the point of
the region $G$ has abscissa $x < 1/R$, then

$$S(x, y) = xy + 2V\left(\frac{1}{x} + \frac{1}{y}\right) > 2V\,\frac{1}{x} > 2VR > N.$$

Analogously, if the point of the region $G$ has its ordinate $y < 1/R$, then
also $S > N$. Also, if the point of the region $G$ has its abscissa
$x > 1/R$ and if it lies above the straight line $AF$ or has its ordinate
$y > 1/R$ and lies to the right of the straight line $CE$, then

$$S(x, y) > xy > R^2 \cdot \frac{1}{R} = R > N.$$

Thus, for all points $(x, y)$ of the region $G$ lying outside the square
$\Omega_R$, the inequality $S(x, y) > N$ holds, and since
$S(x_0, y_0) = N$, the point $(x_0, y_0)$ must belong to the square and
consequently the minimum of our function on $G$ is equal to its minimum on
the square.

But the function $S(x, y)$ is continuous in this square and on its boundary,
so that by the theorem stated earlier there exists in the square a point
$(x, y)$ where our function assumes its minimum for points in the square
and consequently for the entire region $G$. Thus the existence of a minimum
has been proved.

This argument may serve as an example of the way that it is possible
to discuss the existence of a maximum or a minimum for a function
defined on an unbounded domain.

The Taylor formula. Like functions of one variable, functions of
several variables may be represented by a Taylor formula. For example,
an expansion of the function

$$u = f(x, y)$$

in the neighborhood of the point $(x_0, y_0)$ has the following form, if we
confine ourselves to the first and second powers of $x - x_0$ and $y - y_0$:

$$
\begin{aligned}
f(x, y) = f(x_0, y_0) &+ [f'_x(x_0, y_0)(x - x_0) + f'_y(x_0, y_0)(y - y_0)]\\
&+ \frac{1}{2!}\,[f''_{xx}(x_0, y_0)(x - x_0)^2 + 2f''_{xy}(x_0, y_0)(x - x_0)(y - y_0)\\
&\qquad + f''_{yy}(x_0, y_0)(y - y_0)^2] + R_3.
\end{aligned}
$$

If the function $f(x, y)$ has continuous partial derivatives of the second
order, the remainder term here will approach zero faster than

$$r^2 = (x - x_0)^2 + (y - y_0)^2,$$

that is, faster than the square of the distance between the points $(x, y)$
and $(x_0, y_0)$, as $r \to 0$. The Taylor formula provides a widely used
method of defining and approximately calculating the values of various
functions.
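The statement about the remainder can be observed numerically. In the sketch below the function $e^x \cos y$ and the point $(0, 0)$ are chosen only for illustration; at this point $f = 1$, $f'_x = 1$, $f'_y = 0$, $f''_{xx} = 1$, $f''_{xy} = 0$, $f''_{yy} = -1$.

```python
import math


def f(x, y):
    return math.exp(x) * math.cos(y)


def taylor2(dx, dy):
    """Second-order Taylor polynomial of f about (0, 0)."""
    # Uses f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1 at (0, 0).
    return 1.0 + dx + 0.5 * (dx * dx - dy * dy)


def remainder_ratio(r):
    """Remainder divided by r^2 at distance r along a fixed direction."""
    dx, dy = 0.6 * r, 0.8 * r
    return (f(dx, dy) - taylor2(dx, dy)) / r ** 2


for r in (0.1, 0.01, 0.001):
    print(r, remainder_ratio(r))
```

The ratio of the remainder to $r^2$ shrinks roughly in proportion to $r$, so the remainder does tend to zero faster than the square of the distance.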

Let us note that with the help of this formula we can also answer the
question asked earlier, whether a given function actually has a maximum
or minimum at a point where $\partial f/\partial x = \partial f/\partial y = 0$.
In fact, if these conditions are satisfied at the point $(x_0, y_0)$, then
for points $(x, y)$ close to $(x_0, y_0)$, the value of the function will, by
the Taylor formula, differ from $f(x_0, y_0)$ by the amount

$$f(x, y) - f(x_0, y_0) = \frac{1}{2}\,[A(x - x_0)^2 + 2B(x - x_0)(y - y_0) + C(y - y_0)^2] + R_3, \qquad (42)$$

where $A$, $B$, and $C$ denote respectively the second partial derivatives
$f''_{xx}$, $f''_{xy}$, $f''_{yy}$ at the point $(x_0, y_0)$.

If it turns out that the function

$$\Phi(x, y) = A(x - x_0)^2 + 2B(x - x_0)(y - y_0) + C(y - y_0)^2$$

is positive for arbitrary values of $(x - x_0)$ and $(y - y_0)$ not both
equal to zero, then the right side of equation (42) will also be positive for
small values of $(x - x_0)$ and $(y - y_0)$, since for sufficiently small
$(x - x_0)$ and $(y - y_0)$ the quantity $R_3$ is known to be less in
absolute value than $\frac{1}{2}\Phi(x, y)$. Thus it will follow that at the
point $(x_0, y_0)$ the function $f$ attains its minimum. On the other hand,
if the function $\Phi(x, y)$ is negative for arbitrary $(x - x_0)$ and
$(y - y_0)$ not both equal to zero, the right side of (42) will be negative
for small $(x - x_0)$ and $(y - y_0)$, so that at the point $(x_0, y_0)$ the
function will have a maximum. In more complicated cases it is necessary to
consider the succeeding terms in the Taylor formula.
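Whether the quadratic expression $\Phi$ keeps one sign can be decided from its coefficients alone. By a standard algebraic criterion (not stated in the text), $\Phi$ is positive for all increments not both zero exactly when $A > 0$ and $AC - B^2 > 0$, and negative exactly when $A < 0$ and $AC - B^2 > 0$. A minimal sketch with a hypothetical function name:

```python
def classify(A, B, C):
    """Classify a point where both first partial derivatives vanish.

    A, B, C are the second partial derivatives f_xx, f_xy, f_yy there.
    """
    disc = A * C - B * B
    if disc > 0:
        # Phi keeps one sign; the sign of A decides which.
        return "minimum" if A > 0 else "maximum"
    if disc < 0:
        return "neither (Phi changes sign)"
    return "undecided (higher-order terms needed)"


# Second derivatives of S = xy + 2V(1/x + 1/y) at x = y = (2V)^(1/3):
# S_xx = 4V / x^3 = 2, S_xy = 1, S_yy = 4V / y^3 = 2.
print(classify(2.0, 1.0, 2.0))
print(classify(0.0, 1.0, 0.0))  # f = xy at the origin
```

For the tin box of the earlier example the test reports a minimum. The last branch corresponds to the more complicated cases in which the succeeding terms of the Taylor formula must be examined.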

Problems concerning the maximum or the minimum of functions of 
three or more variables may be treated in a completely analogous fashion. 
As an exercise the reader may prove that if given masses

$$m_1, m_2, \ldots, m_n$$

are arranged in space at given points

$$P_1(x_1, y_1, z_1),\ P_2(x_2, y_2, z_2),\ \ldots,\ P_n(x_n, y_n, z_n),$$

the moment (of inertia) $M$ of this system of masses about the point
$P(x, y, z)$, defined as the sum of the products of the masses and the
squares of their distances from the point $P$,

$$M(x, y, z) = \sum_{i=1}^{n} m_i\left[(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2\right],$$

will be a minimum if the point $P$ is at the so-called center of gravity of
the system, with the coordinates

$$x = \frac{\sum_{i=1}^{n} m_i x_i}{\sum_{i=1}^{n} m_i}, \quad y = \frac{\sum_{i=1}^{n} m_i y_i}{\sum_{i=1}^{n} m_i}, \quad z = \frac{\sum_{i=1}^{n} m_i z_i}{\sum_{i=1}^{n} m_i}.$$
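Before attempting the proof, the exercise can be checked numerically. The sketch below assumes three arbitrarily chosen masses and points (they are not from the text) and compares the moment at the center of gravity with the moment at randomly displaced points.

```python
import random

def moment(p, masses, points):
    """M(x, y, z): sum of m_i times the squared distance from p to P_i."""
    x, y, z = p
    return sum(m * ((x - xi) ** 2 + (y - yi) ** 2 + (z - zi) ** 2)
               for m, (xi, yi, zi) in zip(masses, points))

def center_of_gravity(masses, points):
    """Mass-weighted average of the coordinates of the points."""
    total = sum(masses)
    return tuple(sum(m * p[k] for m, p in zip(masses, points)) / total
                 for k in range(3))

# Illustrative data (an assumption made for this sketch).
masses = [1.0, 2.0, 3.0]
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (0.0, 4.0, 1.0)]

c = center_of_gravity(masses, points)
m_min = moment(c, masses, points)

# Moments about 1000 randomly displaced points near the center of gravity.
random.seed(0)
trial_moments = [
    moment(tuple(ck + random.uniform(-1.0, 1.0) for ck in c), masses, points)
    for _ in range(1000)
]
print(c, m_min, min(trial_moments))
```

None of the displaced points gives a smaller moment, which is what the exercise asserts; the check is of course no substitute for setting the three partial derivatives of $M$ equal to zero.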


Maxima and minima with subsidiary conditions. For functions of
several variables we may set up various problems concerning maximum
and minimum. Let us illustrate with a simple example. Suppose that
among all rectangles inscribed in a circle of radius $R$, we wish to find the
one with greatest area. The area of a rectangle is equal to the product
$xy$ of its sides, where $x$ and $y$ are positive numbers connected in this case
by the relation $x^2 + y^2 = (2R)^2$, as is clear from figure 30. Thus we are
required to find the maximum of the function $f(x, y) = xy$ for all $x$ and
$y$ satisfying the relation $x^2 + y^2 = 4R^2$.

Problems of this sort, where it is necessary to find the maximum (or
minimum) of a function $f(x, y)$ for those values only of $x$ and $y$ that
satisfy a certain relation $\varphi(x, y) = 0$, are very common in practice.

Of course, it would be possible to solve the equation $\varphi(x, y) = 0$
for $y$, to substitute the solution into the function $f(x, y)$, and in this
way to seek the ordinary maximum for a function of one variable $x$. But this
method is usually complicated and sometimes impossible.

Fig. 30.

For the solution of such problems in analysis, a much more convenient
procedure, called the method of Lagrange multipliers, has been worked
out. The idea behind it is extremely simple. Let us consider the function

$$F(x, y) = f(x, y) + \lambda\varphi(x, y),$$

where $\lambda$ is an arbitrary constant. Obviously, for $x$, $y$ satisfying
the condition $\varphi(x, y) = 0$, the values of $F(x, y)$ coincide with those of $f(x, y)$.

For the function $F(x, y)$ let us seek a maximum without conditions of any
kind on $x$ and $y$. At the maximum point the conditions
$\partial F/\partial x = \partial F/\partial y = 0$* must hold; in other words

$$\frac{\partial f}{\partial x} + \lambda\frac{\partial \varphi}{\partial x} = 0, \tag{43}$$

$$\frac{\partial f}{\partial y} + \lambda\frac{\partial \varphi}{\partial y} = 0. \tag{44}$$


* We are speaking here, of course, of a maximum attained in the domain of definition
of the function $F(x, y)$. The functions $f(x, y)$ and $\varphi(x, y)$ are assumed to be differentiable.
The values of $x$ and $y$ at the maximum point for $F(x, y)$, being a solution
of the system (43) and (44), depend on the coefficient $\lambda$ in these equations.
Let us now suppose that we have succeeded in choosing the number $\lambda$
in such a way that the coordinates of the maximum point satisfy the
condition

$$\varphi(x, y) = 0. \tag{45}$$

Then this point will be a local maximum for the original problem.

In fact, we may consider the problem geometrically as follows. The
function $f(x, y)$ is defined on a certain region $G$ (figure 31). The
condition $\varphi(x, y) = 0$ will ordinarily be satisfied by the points of
some curve $\Gamma$. We are required to find the greatest value of $f(x, y)$
on points of the curve $\Gamma$. If $F(x, y)$ attains its maximum at a point of
the curve $\Gamma$, then $F(x, y)$ does not increase for small shifts in an
arbitrary direction from this point, and in particular for shifts along the
curve $\Gamma$. But for shifts along $\Gamma$, the values of $F(x, y)$ coincide
with those of $f(x, y)$, which means that for small shifts along the curve the
function $f(x, y)$ does not increase, or in other words it has a local
maximum at the point.

These arguments indicate a simple method of solving the problem. We
solve equations (43), (44), (45) for the unknowns $x$, $y$, and $\lambda$, obtaining
one or more solutions

$$(x_1, y_1, \lambda_1),\ (x_2, y_2, \lambda_2),\ \ldots. \tag{46}$$

To the points $(x_1, y_1)$, $(x_2, y_2)$, $\ldots$ so determined we adjoin those
points of the boundary of $G$ where the curve $\Gamma$ leaves the region $G$. Then
from all these points we choose that one at which $f(x, y)$ takes on its
greatest (or smallest) value.

Of course, the arguments here are far from proving the correctness of
the method. In fact, we have not yet even proved that the points of local
maximum for $f(x, y)$ on the curve $\Gamma$ can be obtained as maximum points
for the function $F(x, y)$ for some value of $\lambda$. However, it is possible to
prove, as is done in the textbooks in analysis, that every point $(x_0, y_0)$
where $f(x, y)$ has a local maximum on the curve will be obtained by the
method indicated, provided only that at this point the partial derivatives
$\varphi'_x(x_0, y_0)$ and $\varphi'_y(x_0, y_0)$ are not both equal to zero.*

Let us use the method of Lagrange to solve the problem at the beginning
of the present section. In this case $f(x, y) = xy$; $\varphi(x, y) = x^2 + y^2 - 4R^2$.
We set up the equations (43), (44), (45)

$$y + 2\lambda x = 0,$$
$$x + 2\lambda y = 0,$$
$$x^2 + y^2 = 4R^2,$$

from which, taking into account that $x$ and $y$ are positive, we find the unique
solution

$$x = y = R\sqrt{2} \quad \left(\lambda = -\tfrac{1}{2}\right).$$

For these values of $x$ and $y$, which are equal to one another so that the
inscribed rectangle is a square, the area is in fact a maximum.
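The solution just found can be checked in two independent ways. The sketch below assumes an arbitrary radius `R = 1.5`; it first substitutes $x = y = R\sqrt{2}$, $\lambda = -\tfrac{1}{2}$ into the three equations, and then scans the constraint curve directly, which is the substitution approach rather than the Lagrange method itself.

```python
import math

R = 1.5  # illustrative radius (an assumption made for this sketch)
x = y = R * math.sqrt(2.0)
lam = -0.5

# Left-hand sides of (43), (44), (45) for f = x*y and phi = x^2 + y^2 - 4R^2;
# all three should vanish at the solution.
residuals = (y + 2.0 * lam * x, x + 2.0 * lam * y, x * x + y * y - 4.0 * R * R)

def area_on_circle(t):
    """Area x*y of the rectangle with sides x = 2R cos t, y = 2R sin t."""
    return (2.0 * R * math.cos(t)) * (2.0 * R * math.sin(t))

# Brute-force scan over 0 < t < pi/2, where both sides are positive.
N = 1000
best_area = max(area_on_circle(0.5 * math.pi * k / N) for k in range(1, N))
print(residuals, x * y, best_area)
```

Both routes give the area $2R^2$ of the inscribed square.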

The method of Lagrange may be extended to deal with functions of 
three or more variables. There may be any number of subsidiary conditions 
(smaller than the number of variables) of the type of condition (45), and 
we will introduce the corresponding number of auxiliary multipliers. 

Let us give some examples of problems involving maxima or minima 
with subsidiary conditions. 

Example 1. For what height $h$ and radius $r$ will an open cylindrical
tank of given volume $V$ require the least amount of sheet metal for its
manufacture; that is, for what $h$ and $r$ will the area of its sides and
circular base be a minimum?

The problem obviously reduces to finding the minimum of the function
of the variables $r$ and $h$

$$f(r, h) = 2\pi rh + \pi r^2$$

under the condition $\pi r^2 h = V$, which may be written in the form
$\varphi(r, h) = \pi r^2 h - V = 0$.
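Example 1 can be worked through with equations (43) to (45). The following is a sketch of one way to carry out the computation, offered as a check rather than as the derivation intended by the authors; it sets $F = f + \lambda\varphi$ and differentiates with respect to $r$ and $h$.

```latex
\begin{aligned}
\frac{\partial F}{\partial r} &= 2\pi h + 2\pi r + 2\lambda\pi r h = 0,\\
\frac{\partial F}{\partial h} &= 2\pi r + \lambda\pi r^{2} = 0
  \quad\Longrightarrow\quad \lambda = -\frac{2}{r},\\
2\pi h + 2\pi r - 4\pi h &= 0
  \quad\Longrightarrow\quad h = r,\\
\pi r^{2} h &= V
  \quad\Longrightarrow\quad r = h = \sqrt[3]{V/\pi}.
\end{aligned}
```

Under this computation the most economical open tank has height equal to its radius. That the stationary point is a minimum can be seen by substituting $h = V/(\pi r^2)$, which gives $f = 2V/r + \pi r^2$, a function that grows without bound both as $r \to 0$ and as $r \to \infty$.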


* In the course in higher mathematics of V. I. Smirnov, the reader will find a simple 
example where this particular feature of the situation would lead to the loss of a solution 
if we apply the method of Lagrange mechanically and do not consider, in addition to 
the points mentioned above, a point where not only (45) holds but also: 

$$\varphi'_x(x_0, y_0) = 0, \quad \varphi'_y(x_0, y_0) = 0.$$
Example 2. A moving point is required to pass from $A$ to $B$ (figure 32).
On the path $AM$ it moves with the velocity $v_1$, and on $MB$ with the
velocity $v_2$. Where should the point $M$ be placed on the line $DD'$ so that
the entire path from $A$ to $B$ may be covered as quickly as possible?

Let us take as unknowns the angles $\alpha$ and $\beta$ marked in figure 32. The
lengths $a$ and $b$ of the perpendiculars from the points $A$ and $B$ to the
straight line $DD'$ and the distance $c$ between them are known. The time
required for covering the entire path is represented, as can easily be seen,
by the formula

$$f(\alpha, \beta) = \frac{a}{v_1\cos\alpha} + \frac{b}{v_2\cos\beta}.$$

It is required to find the minimum of this expression, taking into account
the fact that $\alpha$ and $\beta$ are connected by the relation

$$a\tan\alpha + b\tan\beta = c.$$

The reader may solve these examples by the Lagrange method. In the
second example he will find that the best position for $M$ is given by the
condition

$$\frac{\sin\alpha}{\sin\beta} = \frac{v_1}{v_2}.$$


This is the well-known law for the refraction of light. Consequently, a ray
of light will be refracted in its passage from one medium to another in
such a way that the time for its passage from a point in one medium
to a point in the other is a minimum. Conclusions of this sort are interesting
not only for computational purposes but also from a general philosophical
point of view; they have inspired researchers in the exact sciences to
penetrate further and further into the profound and general laws of
nature.
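The refraction condition of Example 2 can be confirmed numerically without the Lagrange method. The sketch below is an independent check under stated assumptions: the numbers `a, b, c, v1, v2` are arbitrary, the position of $M$ is described by the distance `x = a*tan(alpha)` measured along $DD'$, and the minimum is located by a ternary search, which is legitimate here because the travel time is a convex function of `x`.

```python
import math

def fastest_crossing(a, b, c, v1, v2, iterations=200):
    """Ternary search for the point M minimizing the travel time.

    x is the distance along DD' from the foot of the perpendicular from A
    to M, so that x = a*tan(alpha) and c - x = b*tan(beta).
    """
    def travel_time(x):
        return math.hypot(a, x) / v1 + math.hypot(b, c - x) / v2

    lo, hi = 0.0, c
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1) < travel_time(m2):
            hi = m2
        else:
            lo = m1
    x = 0.5 * (lo + hi)
    sin_alpha = x / math.hypot(a, x)
    sin_beta = (c - x) / math.hypot(b, c - x)
    return x, sin_alpha, sin_beta

# Illustrative data (an assumption made for this sketch).
a, b, c, v1, v2 = 1.0, 2.0, 3.0, 3.0, 2.0
x, sin_alpha, sin_beta = fastest_crossing(a, b, c, v1, v2)
print(sin_alpha / sin_beta, v1 / v2)
```

The two printed numbers agree to within the accuracy of the search, in line with the condition $\sin\alpha/\sin\beta = v_1/v_2$.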

Finally let us note that the multipliers $\lambda$, introduced in the solution of
problems by the method of Lagrange, are not merely auxiliary numbers.
In each case they are closely connected with the essential nature of the
particular problem and have a concrete interpretation.
§13. Generalizations of the Concept of Integral

In §10 we defined the definite integral of the function $f(x)$ on the interval
$[a, b]$ as the limit of the sum

$$\sum f(\xi_i)\,\Delta x_i,$$

when the length of the greatest segment $\Delta x_i$ in the subdivision of $[a, b]$
approaches zero. In spite of the fact that the class of functions $f(x)$ for
which this limit actually exists (the class of integrable functions) is a
very wide one, and in particular includes all continuous and even many
discontinuous functions, this class of functions has a serious shortcoming.
If we add, subtract, or multiply, or under certain conditions divide the
values of two integrable functions $f(x)$ and $\varphi(x)$, we obtain functions
which, as may easily be proved, are again integrable. For $f(x)/\varphi(x)$ this
will be true in all cases in which $1/\varphi(x)$ remains bounded on $[a, b]$. But
if a function is obtained as a result of a limiting process from a sequence
of approximating integrable functions $f_1(x), f_2(x), f_3(x), \ldots$ such that for
all values of $x$ in the interval $[a, b]$

$$f(x) = \lim_{n\to\infty} f_n(x),$$

then the limit function $f(x)$ is not necessarily integrable.

In many cases this and other circumstances give rise to considerable
complication, since the process of passing to a limit is widely used.

A way out of the difficulty was discovered by making further generalizations
of the concept of an integral. The most important of these is the
integral of Lebesgue, with which the reader will become acquainted in
Chapter XV on the theory of functions of a real variable. But here we
will confine ourselves to generalizations of the integral in other directions,
which are also of the greatest importance in practice.

Multiple integrals. We have already studied the process of integration
for functions of one variable defined on a one-dimensional region, namely
an interval. But the analogous process may be extended to functions of
two, three, or more variables, defined on corresponding regions.

For example, let us consider a surface

$$z = f(x, y)$$
defined in a rectangular system of coordinates, and on the plane $Oxy$ let
there be given a region $G$ bounded by a closed curve $\Gamma$. It is required to
find the volume bounded by the surface, by the plane $Oxy$, and by the
cylindrical surface passing through the curve $\Gamma$ with generators parallel
to the $Oz$ axis (figure 33). To solve this problem we divide the plane
region $G$ into subregions by a network of straight lines parallel to the
axes $Ox$ and $Oy$ and denote by

$$G_1, G_2, \ldots, G_n$$

those subregions which consist of complete rectangles.

Fig. 33.

If the net is sufficiently fine, then practically the
whole of the region $G$ will be covered by the enumerated rectangles. In
each of them we choose at will a point

$$(\xi_1, \eta_1),\ (\xi_2, \eta_2),\ \ldots,\ (\xi_n, \eta_n)$$

and, assuming for simplicity that $G_i$ denotes not only the rectangle but
also its area, we set up the sum

$$S_n = f(\xi_1, \eta_1)G_1 + f(\xi_2, \eta_2)G_2 + \cdots + f(\xi_n, \eta_n)G_n = \sum_{i=1}^{n} f(\xi_i, \eta_i)G_i. \tag{47}$$


It is clear that, if the surface is continuous and the net is sufficiently
fine, this sum may be brought as near as we like to the desired volume $V$.
We will obtain the desired volume exactly if we take the limit of the sum
(47) for finer and finer subdivisions (that is, for subdivisions such that
the greatest of the diagonals of our rectangles approaches zero)

$$\lim_{\max d(G_i) \to 0} \sum_{i=1}^{n} f(\xi_i, \eta_i)G_i = V. \tag{48}$$


From the point of view of analysis it is therefore necessary, in order to
determine the volume $V$, to carry out a certain mathematical operation
on the function $f(x, y)$ and its domain of definition $G$, an operation
indicated by the left side of equality (48). This operation is called the
integration of the function $f$ over the region $G$, and its result is the integral
of $f$ over $G$. It is customary to denote this result in the following way:

$$\iint_G f(x, y)\,dx\,dy = \lim_{\max d(G_i) \to 0} \sum_{i=1}^{n} f(\xi_i, \eta_i)G_i. \tag{49}$$
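The construction of the sum (47) translates almost literally into a short program. The sketch below makes several assumptions of its own: the region $G$ is the unit disk, the surface is $z = 1 - x^2 - y^2$ (for which the volume is known to be $\pi/2$), and the point chosen in each complete rectangle is its center.

```python
import math

def double_integral_over_disk(f, radius, n):
    """Sum (47) over the complete grid cells lying inside a disk.

    The square [-radius, radius]^2 is cut into n*n equal cells; a cell is
    kept only if all four of its corners lie in the disk (since the disk
    is convex, the whole cell then lies in it).
    """
    h = 2.0 * radius / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = -radius + i * h, -radius + j * h
            corners = [(x0, y0), (x0 + h, y0), (x0, y0 + h), (x0 + h, y0 + h)]
            if all(x * x + y * y <= radius * radius for x, y in corners):
                # Chosen point (xi_i, eta_i): the center of the cell.
                total += f(x0 + 0.5 * h, y0 + 0.5 * h) * h * h
    return total

def surface(x, y):
    return 1.0 - x * x - y * y

coarse = double_integral_over_disk(surface, 1.0, 20)
fine = double_integral_over_disk(surface, 1.0, 200)
print(coarse, fine, math.pi / 2.0)
```

As the net is refined, the incomplete cells along the boundary that the sum neglects contribute less and less, and the sum approaches the volume.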
Similarly, we may define the integral of a function of three variables
over a three-dimensional region $G$, representing a certain body in space.
Again we divide the region $G$ into parts, this time by planes parallel to
the coordinate planes. Among these parts we choose the ones which
represent complete parallelepipeds and enumerate them

$$G_1, G_2, \ldots, G_n.$$

In each of these we choose an arbitrary point

$$(\xi_1, \eta_1, \zeta_1),\ (\xi_2, \eta_2, \zeta_2),\ \ldots,\ (\xi_n, \eta_n, \zeta_n)$$

and set up the sum

$$S_n = \sum_{i=1}^{n} f(\xi_i, \eta_i, \zeta_i)G_i, \tag{50}$$

where $G_i$ denotes the volume of the parallelepiped $G_i$. Finally we define
the integral of $f(x, y, z)$ over the region $G$ as the limit

$$\lim_{\max d(G_i) \to 0} \sum_{i=1}^{n} f(\xi_i, \eta_i, \zeta_i)G_i = \iiint_G f(x, y, z)\,dx\,dy\,dz, \tag{51}$$

to which the sum (50) tends when the greatest diagonal $d(G_i)$ approaches
zero.

Let us consider an example. We imagine the region $G$ is filled with a
nonhomogeneous mass whose density at each point in $G$ is given by a
known function $\rho(x, y, z)$. The density $\rho(x, y, z)$ of the mass at the point
$(x, y, z)$ is defined as the limit approached by the ratio of the mass of an
arbitrary small region containing the point $(x, y, z)$ to the volume of
this region as its diameter approaches zero.* To determine the mass of
the body $G$ it is natural to proceed as follows. We divide the region $G$
into parts by planes parallel to the coordinate planes and enumerate
the complete parallelepipeds formed in this way

$$G_1, G_2, \ldots, G_n.$$

Assuming that the dividing planes are sufficiently close to one another,
we will make only a small error if we neglect the irregular regions of the
body and define the mass of each of the regular regions $G_i$ (the complete
parallelepipeds) as the product

$$\rho(\xi_i, \eta_i, \zeta_i)G_i,$$

where $(\xi_i, \eta_i, \zeta_i)$ is an arbitrary point of $G_i$. As a result the approximate
value of the mass $M$ will be expressed by the sum

$$M_n = \sum_{i=1}^{n} \rho(\xi_i, \eta_i, \zeta_i)G_i,$$

and its exact value will clearly be the limit of this sum as the greatest
diagonal $d(G_i)$ approaches zero; that is

$$M = \iiint_G \rho(x, y, z)\,dx\,dy\,dz = \lim_{\max d(G_i) \to 0} \sum_{i=1}^{n} \rho(\xi_i, \eta_i, \zeta_i)G_i.$$

* The diameter of a region is defined as the least upper bound of the distance between
two points of the region.

The integrals (49) and (51) are called double and triple integrals
respectively.
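The mass computation can be imitated for a simple body. In the sketch below the body is assumed to be the unit cube, so that every cell of the subdivision is a complete parallelepiped, and the density $\rho = 1 + x + 2y + 3z^2$ is an arbitrary illustrative choice whose exact mass is $3.5$.

```python
def mass_of_unit_cube(density, n):
    """Sum of rho(xi_i, eta_i, zeta_i) * G_i over an n*n*n grid of cells."""
    h = 1.0 / n
    cell_volume = h ** 3
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # The chosen point of each cell is its center.
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                total += density(x, y, z) * cell_volume
    return total

def density(x, y, z):
    """Illustrative density (an assumption made for this sketch)."""
    return 1.0 + x + 2.0 * y + 3.0 * z * z

mass_coarse = mass_of_unit_cube(density, 10)
mass_fine = mass_of_unit_cube(density, 40)
print(mass_coarse, mass_fine)
```

The sums approach $3.5$ as the dividing planes are taken closer together.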

Let us examine a problem which leads to a double integral. We imagine
that water is flowing over a plane surface. Also, on this surface
underground water is seeping through (or soaking back into the ground)
with an intensity $f(x, y)$ which is different at different points. We
consider a region $G$ bounded by a closed contour (figure 34) and assume
that at every point of $G$ we know the intensity $f(x, y)$, namely the amount
of underground water seeping through per minute per cm$^2$ of surface;
we will have $f(x, y) > 0$ where the water is seeping through and
$f(x, y) < 0$ where it is soaking into the ground. How much water will
accumulate on the surface $G$ per minute?

Fig. 34.

If we divide $G$ into small parts, consider the rate of seepage as approximately
constant in each part, and then pass to the limit for finer and
finer subdivisions, we will obtain an expression for the whole amount
of accumulated water in the form of an integral

$$\iint_G f(x, y)\,dx\,dy.$$

Double (two-fold) integrals were first introduced by Euler. Multiple
integrals form an instrument which is used every day in calculations and
investigations of the most varied kind.
It would also be possible to show, though we will not do it here, that
calculation of multiple integrals may be reduced, as a rule, to iterated
calculation of ordinary one-dimensional integrals.

Contour and surface integrals. Finally, we must mention that still
other generalizations of the integral are possible. For example, the problem
of defining the work done by a variable force applied to a material point,
as the latter moves along a given curve, naturally leads to a so-called
curvilinear integral, and the problem of finding the total charge on a
surface on which electricity is continuously distributed with a given surface
density leads to another new concept, an integral over a curved surface.

For example, suppose that a liquid is flowing through space (figure 35) and
that the velocity of a particle of the liquid at the point $(x, y)$ is given by a
function $P(x, y)$, not depending on $z$. If we wish to determine the amount of
liquid flowing per minute through the contour $\Gamma$,* we may
reason in the following way. Let us divide $\Gamma$ up into segments $\Delta s_i$.
The amount of water flowing through one segment $\Delta s_i$ is approximately
equal to the column of liquid shaded in figure 35; this column may be
considered as the amount of liquid forcing its way per minute through
that segment of the contour. But the area of the shaded parallelogram
is equal to

$$P_i(x, y) \cdot \Delta s_i \cdot \cos\alpha_i,$$

where $\alpha_i$ is the angle between the direction $x$ of the $x$-axis and the outward
normal of the surface bounded by the contour $\Gamma$; this normal is
the perpendicular $n$ to the tangent, which we may consider as defining
the direction of the segment $\Delta s_i$. By summing up the areas of such
parallelograms and passing to the limit for finer and finer subdivisions
of the contour $\Gamma$, we determine the amount of water flowing per minute
through the contour $\Gamma$; it is denoted thus:

$$\int_\Gamma P(x, y)\cos(n, x)\,ds$$

and is called a curvilinear integral.

Fig. 35.

* More precisely, through a cylindrical surface with the contour for its base and
with height equal to unity.

If the flow is not everywhere parallel,
then its velocity at each point $(x, y)$ will have a component $P(x, y)$ along
the $x$-axis and a component $Q(x, y)$ along the $y$-axis. In this case we can
show by an analogous argument that the quantity of water flowing through
the contour will be equal to

$$\int_\Gamma [P(x, y)\cos(n, x) + Q(x, y)\cos(n, y)]\,ds.*$$
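The passage from segments $\Delta s_i$ to the curvilinear integral can be imitated by replacing the contour with a closed polygon. The sketch below is written under stated assumptions: the contour is the unit circle approximated by an inscribed polygon traversed counterclockwise, each velocity component is evaluated at the midpoint of a segment, and the two velocity fields are illustrative choices.

```python
import math

def flux_through_contour(P, Q, vertices):
    """Sum of [P*cos(n, x) + Q*cos(n, y)] * ds over the edges of a closed
    polygon whose vertices are listed counterclockwise."""
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % count]
        dx, dy = x1 - x0, y1 - y0
        ds = math.hypot(dx, dy)
        # Outward unit normal of a counterclockwise edge: (dy, -dx) / ds.
        cos_nx, cos_ny = dy / ds, -dx / ds
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += (P(xm, ym) * cos_nx + Q(xm, ym) * cos_ny) * ds
    return total

n = 2000
circle = [(math.cos(2.0 * math.pi * k / n), math.sin(2.0 * math.pi * k / n))
          for k in range(n)]

# Uniform flow along the x-axis: whatever enters on the left leaves on the right.
uniform = flux_through_contour(lambda x, y: 1.0, lambda x, y: 0.0, circle)
# Flow with velocity P = x, Q = 0, which spreads outward from the y-axis.
spreading = flux_through_contour(lambda x, y: x, lambda x, y: 0.0, circle)
print(uniform, spreading)
```

The uniform flow gives a net outflow of zero, while the second field gives a net outflow close to $\pi$, the area of the disk.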

When we speak of an integral over a curved surface $G$ for a function
$f(M)$ of its points $M(x, y, z)$, we mean the limit of sums of the form

$$\lim \sum_{i} f(M_i)\,\Delta\sigma_i = \iint_G f(x, y, z)\,d\sigma$$

for finer and finer subdivisions of the region $G$ into segments whose areas
are equal to $\Delta\sigma_i$.

General methods exist for transforming multiple, curvilinear, and
surface integrals into other forms and for calculating their values, either
exactly or approximately.


Formula of Ostrogradskii. Several important and very general formulas
relating an integral over a volume to an integral over its surface (and
also an integral over a surface, curved or plane, to an integral around its
boundary) were discovered in the middle of the past century by
Ostrogradskii.

We shall not try to give here a proof of the general formula of 
Ostrogradskii, which has very wide application, but will merely illustrate 
it by an example of its simplest particular case. 

Let us imagine, as we did before, that over a plane surface there is a
horizontal flow of water that is also soaking into the ground or seeping
out again from it. We mark off a region $G$, bounded by a curve $\Gamma$, and
assume that for each point of the region we know the components $P(x, y)$
and $Q(x, y)$ of the velocity of the water in the direction of the $x$-axis
and of the $y$-axis respectively.

* Since for small displacements along the curve the differential of the coordinate $y$
is equal to $\cos(n, x)\,ds$ and the differential $dx$ is equal to $-\cos(n, y)\,ds$, this latter
integral is often written in the form

$$\int_\Gamma [P(x, y)\,dy - Q(x, y)\,dx].$$

Let us calculate the rate at which the water is seeping from the ground
at a point with coordinates $(x, y)$. For this purpose we consider a small
rectangle with sides $\Delta x$ and $\Delta y$ situated at the point $(x, y)$.

As a result of the velocity $P(x, y)$ through the left vertical edge of this
rectangle, there will flow approximately $P(x, y)\Delta y$ units of water per
minute into the rectangle, and through the right side in the same time
will flow out approximately $P(x + \Delta x, y)\Delta y$ units. In general, the net
amount of water leaving a square unit of surface as a result of the flow
through its left and right vertical sides will be approximately

$$\frac{[P(x + \Delta x, y) - P(x, y)]\,\Delta y}{\Delta x\,\Delta y}.$$

If we let $\Delta x$ approach zero, we obtain in the limit

$$\frac{\partial P}{\partial x}.$$

Correspondingly, the net rate of flow of water per unit area in the direction
of the $y$-axis will be given by

$$\frac{\partial Q}{\partial y}.$$

This means that the intensity of the seepage of ground water at the point
with coordinates $(x, y)$ will be equal to

$$\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}.$$

But in general, as we saw earlier, the quantity of water coming out
from the ground will be given by the double integral of the function
expressing the intensity of the seepage of ground water at each point,
namely

$$\iint_G \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) dx\,dy. \tag{52}$$

But, since the water is incompressible, this entire quantity must flow out
during the same time through the boundaries of the contour $\Gamma$. The
quantity of water flowing out through the contour $\Gamma$ is expressed, as we
saw earlier, by the curvilinear integral over $\Gamma$

$$\int_\Gamma [P(x, y)\cos(n, x) + Q(x, y)\cos(n, y)]\,ds. \tag{53}$$
The equality of the magnitudes (52) and (53) expresses the formula of
Ostrogradskii in its simplest two-dimensional case

$$\iint_G \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) dx\,dy = \int_\Gamma [P(x, y)\cos(n, x) + Q(x, y)\cos(n, y)]\,ds.$$

We have merely explained the meaning of this formula by a physical
example, but it can be proved mathematically.

In this way the mathematical theorem of Ostrogradskii reflects a
widespread phenomenon in the external world, which in our example we
interpreted in a readily visualized way as preservation of the volume of
an incompressible fluid.
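The two-dimensional formula can be tested on a simple region. The sketch below assumes that $G$ is the unit square and that the velocity field is $P = x^2 y$, $Q = x y^2$ (an illustrative choice, not taken from the text); both sides are approximated by midpoint sums, and each of them should equal 1.

```python
def seepage_integral(dPdx, dQdy, n):
    """Midpoint sum of dP/dx + dQ/dy over the unit square, as in (52)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            total += (dPdx(x, y) + dQdy(x, y)) * h * h
    return total

def outward_flux(P, Q, n):
    """Midpoint sum of P*cos(n, x) + Q*cos(n, y) along the four sides, as in (53)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += P(1.0, t) * h    # right side, outward normal (1, 0)
        total -= P(0.0, t) * h    # left side, outward normal (-1, 0)
        total += Q(t, 1.0) * h    # top side, outward normal (0, 1)
        total -= Q(t, 0.0) * h    # bottom side, outward normal (0, -1)
    return total

# Illustrative velocity field and its partial derivatives (assumptions).
P = lambda x, y: x * x * y
Q = lambda x, y: x * y * y
dPdx = lambda x, y: 2.0 * x * y
dQdy = lambda x, y: 2.0 * x * y

lhs = seepage_integral(dPdx, dQdy, 100)
rhs = outward_flux(P, Q, 100)
print(lhs, rhs)
```

The agreement of the two numbers is the balance described above: what seeps out of the ground inside $G$ leaves through the boundary.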

Ostrogradskii established a considerably more general formula expressing
the connection between an integral over a multidimensional volume and
an integral over its surface. In particular, for a three-dimensional body $G$,
bounded by the surface $\Gamma$, his formula is

$$\iiint_G \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}\right) dx\,dy\,dz = \iint_\Gamma [P\cos(n, x) + Q\cos(n, y) + R\cos(n, z)]\,d\sigma,$$

where $d\sigma$ is the element of surface.

It is interesting to note that the fundamental formula of the integral
calculus

$$\int_a^b f(x)\,dx = F(b) - F(a) \tag{54}$$

may be considered as a one-dimensional case of the formula of
Ostrogradskii. The equation (54) connects the integral over an interval
with the "integral" over its "null-dimensional" boundary, consisting of
the two end points.

Formula (54) may be illustrated by the following analogy. Let us
imagine that in a straight pipe with constant cross section $s = 1$ water
is flowing with velocity $F(x)$, which is different for different cross sections
(figure 36). Through the porous walls of the pipe, water is seeping into it
(or out of it) at a rate which is also different for different cross sections.

Fig. 36.

If we consider a segment of the pipe from $x$ to $x + \Delta x$, the quantity
of water seeping into it in unit time must be compensated by the difference
$F(x + \Delta x) - F(x)$ between the quantity flowing out of this segment
and the quantity flowing into it along the pipe. So the quantity seeping
into the segment is equal to the difference $F(x + \Delta x) - F(x)$, and
consequently the rate of seepage per unit length of pipe (the ratio of the
seepage over an infinitesimal segment to the length of the segment) will
be equal to

$$f(x) = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x} = F'(x).$$

More generally, the quantity of water seeping into the pipe over the
whole section $[a, b]$ must be equal to the amount lost by flow through
the ends of the pipe. But the amount seeping through the walls is equal
to $\int_a^b f(x)\,dx$ and the amount lost by flow through the ends is $F(b) - F(a)$.
The equality of these two magnitudes produces formula (54).


§14. Series

Concept of a series. A series in mathematics is an expression of the
form

$$u_0 + u_1 + u_2 + \cdots.$$

The numbers $u_k$ are called the terms of the series. There is an infinite
number of them, and they are arranged in a definite order, so that to
each natural number $k = 0, 1, 2, \ldots$ there corresponds a definite value $u_k$.

The reader must keep in mind that we have not said whether it is
possible to calculate a value for such expressions or, in case it is possible,
how to do it. The presence of a plus sign between the terms $u_k$ in our
expression seems to indicate that in some way all the terms should be
added. But there are infinitely many of them and addition of numbers is
defined only for a finite number of terms.

Let us denote by $S_n$ the sum of the first $n$ terms of the series; we will
call it the $n$th partial sum. As a result we obtain a sequence of numbers

$$S_1 = u_0,$$
$$S_2 = u_0 + u_1,$$
$$\cdots$$
$$S_n = u_0 + u_1 + \cdots + u_{n-1},$$

and we may speak of a variable quantity $S_n$, where $n = 1, 2, \ldots$.
The series is said to be convergent if, as $n \to \infty$, the variable $S_n$
approaches a definite finite limit

$$\lim_{n \to \infty} S_n = S.$$

This limit is called the sum of the series, and in this case we write

$$S = u_0 + u_1 + u_2 + \cdots.$$

But if, as $n \to \infty$, the limit of $S_n$ does not exist, then the series is said to be
divergent and in this case there is no sense in speaking of its sum.* But
if all the $u_n$ have the same sign, then it is customary to say that the sum
of the series is equal to infinity with the corresponding sign.

As an example, let us consider the series

$$1 + x + x^2 + \cdots,$$

whose terms form a geometric progression with common ratio $x$.

The sum of the first $n$ terms is equal to

$$S_n(x) = \frac{1 - x^n}{1 - x}; \tag{55}$$

if $|x| < 1$ this sum has a limit

$$\lim_{n \to \infty} S_n(x) = \frac{1}{1 - x},$$

and so for $|x| < 1$ we may write

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots.$$

If $|x| > 1$, then obviously

$$\lim_{n \to \infty} S_n(x) = \infty,$$

and the series diverges. The same situation holds also for $x = 1$, as may
be seen immediately without use of formula (55), which for $x = 1$ has
no meaning. Finally, if $x = -1$ the partial sums take the values $+1$
and $0$ alternately, so that this series also is divergent.
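The behaviour of the geometric series in the different cases is easy to observe numerically. The following sketch adds the terms one by one and compares the result with formula (55); the function names and the sample values of $x$ are illustrative.

```python
def partial_sum(x, n):
    """S_n(x) = 1 + x + ... + x^(n-1), added term by term."""
    total, term = 0.0, 1.0
    for _ in range(n):
        total += term
        term *= x
    return total

def closed_form(x, n):
    """Formula (55); it has no meaning for x = 1."""
    return (1.0 - x ** n) / (1.0 - x)

inside = partial_sum(0.5, 60)                            # |x| < 1: close to 1/(1 - x)
growing = partial_sum(2.0, 10)                           # |x| > 1: unbounded growth
at_one = partial_sum(1.0, 7)                             # x = 1: S_n = n
at_minus_one = [partial_sum(-1.0, n) for n in range(1, 5)]
print(inside, growing, at_one, at_minus_one)
```

For $x = 0.5$ the partial sums settle near $1/(1 - x) = 2$, while for $x = -1$ they alternate between 1 and 0 and so approach no limit.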

* Let us note that it is also possible to give generalized definitions of the sum of
a series, by virtue of which it is possible to assign to certain divergent series a more or
less natural concept of "generalized sum." Such series are said to be summable.
Operations with generalized sums of divergent series are sometimes useful.
To each series there corresponds a definite sequence of values of its
partial sums $S_1, S_2, S_3, \ldots$ such that the convergence of the series
depends on the fact that the sums approach a limit. Conversely, to an
arbitrary sequence of numbers $S_1, S_2, S_3, \ldots$ corresponds a series

$$S_1 + (S_2 - S_1) + (S_3 - S_2) + \cdots,$$

the partial sums of which will be the numbers of the sequence. Thus the
theory of variables ranging over a sequence may be reduced to the theory
of the corresponding series, and conversely. Yet each of these theories
has independent significance. In some cases it is more convenient to study
the variable directly and in others to consider the equivalent series.

Let us note that series have long served as an important method of
representing various entities (above all, functions) and of calculating
their value. Of course, the views of mathematicians concerning series
have changed with the passage of time, corresponding to the changes in
their ideas about infinitesimals. The above clear-cut definition of convergence
and divergence of a series was formulated at the beginning
of the last century at the same time as the closely associated concept of
a limit.

If the series converges, then its general term approaches zero with
increasing $n$, since

$$\lim_{n \to \infty} u_n = \lim_{n \to \infty} (S_{n+1} - S_n) = S - S = 0.$$

From examples given in the following paragraphs, it will be clear that
the converse statement is in general false. But the criterion is still a
useful one, since it provides a necessary condition for the convergence
of a series. For example, the divergence of a geometric progression with
common ratio $x > 1$ follows immediately from the fact that its general term
does not approach zero.

If the series consists of positive terms, then its partial sum $S_n$ increases
with increasing $n$ and only two cases can exist: Either the variable $S_n$
becomes and remains greater than any preassigned number $A$ for sufficiently
large $n$, in which case $\lim_{n \to \infty} S_n = \infty$, so that the series diverges;
or else there exists a number $A$ such that for all $n$ the value of $S_n$ does
not exceed $A$; but then the variable $S_n$ necessarily approaches a definite
finite limit not greater than $A$ and the series is convergent.

Convergence of a series. The question whether a given series converges
or diverges may often be settled by comparing it with another
series. Here it is customary to make use of the following criterion.

If we are given two series

$$u_0 + u_1 + u_2 + \cdots,$$
$$v_0 + v_1 + v_2 + \cdots$$

with positive terms such that for all values of $n$, beginning with a certain
one, we have the inequality

$$u_n \le v_n,$$

then the convergence of the second series implies the convergence of
the first, and the divergence of the first implies the divergence of the
second.

For example, let us consider the so-called harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \cdots.$$

Its terms are correspondingly not less than the terms of the series

$$1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{2\ \text{times}} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{4\ \text{times}} + \underbrace{\frac{1}{16} + \cdots + \frac{1}{16}}_{8\ \text{times}} + \cdots,$$

in which the sum of the terms in each underbraced group is equal to $\tfrac{1}{2}$.

It is clear that the sum $S_n$ of the second series approaches infinity
with increasing $n$, and consequently that the harmonic series diverges.
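The grouping argument says that the first $2^k$ terms of the comparison series add up to exactly $1 + k/2$, so the harmonic partial sum with $2^k$ terms is at least that large. The sketch below, with illustrative names, tabulates both quantities.

```python
def harmonic_partial_sum(n):
    """1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

# Each row: k, number of terms 2^k, harmonic partial sum, lower bound 1 + k/2.
rows = []
for k in range(16):
    n = 2 ** k
    rows.append((k, n, harmonic_partial_sum(n), 1.0 + k / 2.0))

for row in rows:
    print(row)
```

The lower bound grows without limit, though slowly: doubling the number of terms adds only about 0.69 to the partial sum, which is why the divergence is hard to notice by direct computation.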
The series

$$1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \frac{1}{4^\alpha} + \cdots, \tag{56}$$

where $\alpha$ is a positive number less than unity, also obviously diverges,
since for arbitrary $n$

$$\frac{1}{n^\alpha} \ge \frac{1}{n} \quad (0 < \alpha < 1).$$

On the other hand, it is possible to prove that series (56) for $\alpha > 1$ is
convergent. We will prove this here only for the case $\alpha > 2$; for this
purpose we consider the series

$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots$$

with positive terms. It converges to unity as its sum, since its partial
sums $S_n$ are equal to

$$S_n = 1 - \frac{1}{n + 1} \to 1 \quad (n \to \infty).$$

On the other hand, the general term of this series satisfies the inequality

$$\frac{1}{n - 1} - \frac{1}{n} = \frac{1}{(n - 1)n} > \frac{1}{n^2},$$

from which it follows that the series

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$

converges. All the more then will the series (56) converge with $\alpha > 2$.

Let us give here without proof another useful criterion for convergence
and divergence of series with positive terms, the so-called criterion of
d'Alembert.

Let us suppose that, as $n$ approaches infinity, the ratio $u_{n+1}/u_n$
has a limit $q$. Then for $q < 1$ the series will certainly converge, while
for $q > 1$ it will diverge. But for $q = 1$ the question of its convergence
remains open.
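The criterion of d'Alembert can be illustrated by evaluating the ratio $u_{n+1}/u_n$ at a large index. This is only a sketch: a ratio computed at one finite $n$ suggests the limit $q$ but does not establish it, and the four sample series are illustrative choices.

```python
def ratio(u, n):
    """u(n + 1) / u(n) for a function u giving the general term."""
    return u(n + 1) / u(n)

n = 500
q_a = ratio(lambda k: k / 2.0 ** k, n)        # terms k / 2^k: q = 1/2
q_b = ratio(lambda k: 2.0 ** k / k ** 2, n)   # terms 2^k / k^2: q = 2
q_c = ratio(lambda k: 1.0 / k, n)             # harmonic series: q = 1
q_d = ratio(lambda k: 1.0 / k ** 2, n)        # series (56) with alpha = 2: q = 1
print(q_a, q_b, q_c, q_d)
```

The last two cases show why the question remains open for $q = 1$: the harmonic series diverges and the series of inverse squares converges, yet both ratios tend to 1.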

The sum of a finite number of summands does not change if we permute
the summands. But in general this is no longer true for infinite series.
There exist convergent series for which it is possible to permute the terms
in such a way as to change their sum and even to turn them into divergent
series. Series with unstable sums of this sort fail to possess one of the
fundamental properties of ordinary sums, permutability of the summands.
So it is important to distinguish those series which preserve this property.
It turns out that they are the so-called absolutely convergent series.
The series

$$u_0 + u_1 + u_2 + u_3 + \cdots$$

is said to be absolutely convergent if the series

$$|u_0| + |u_1| + |u_2| + |u_3| + \cdots$$

of absolute values of its terms is also convergent. It is possible to prove
that an absolutely convergent series is always convergent; in other words,
that its partial sums $S_n$ approach a finite limit. It is obvious that every
convergent series with terms of one sign is absolutely convergent.

The series

$$\frac{\sin x}{1^2} + \frac{\sin 2x}{2^2} + \frac{\sin 3x}{3^2} + \cdots$$

is an example of an absolutely convergent series, since the terms of the
series

$$\frac{|\sin x|}{1^2} + \frac{|\sin 2x|}{2^2} + \frac{|\sin 3x|}{3^2} + \cdots$$

are not greater than the corresponding terms of the convergent series

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots.$$

An example of a series which is convergent, but not absolutely
convergent, is the following

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots,$$

as the reader may prove for himself.
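
The instability of the sum under permutation can be observed on this very series. The sketch below is illustrative only: the particular rearrangement (two positive terms followed by one negative term) and the function names are our own choice, and the limiting values $\ln 2$ and $\tfrac{3}{2}\ln 2$ are quoted as known facts rather than derived here.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """The same terms, taken two positive and then one negative:
    1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ..."""
    total = 0.0
    for j in range(1, n_blocks + 1):
        total += 1 / (4 * j - 3) + 1 / (4 * j - 1) - 1 / (2 * j)
    return total

usual = alternating_harmonic(200_000)   # close to ln 2
permuted = rearranged(200_000)          # close to (3/2) ln 2

print(usual, math.log(2))
print(permuted, 1.5 * math.log(2))
```

Every term of the original series appears exactly once in the rearranged one, yet the two partial sums settle near different values.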


Series of functions; uniformly convergent series. In analysis we often
have to do with series whose terms are functions of $x$. In the preceding
paragraphs we have already had examples of this sort, for instance, the
series $1 + x + x^2 + x^3 + \cdots$. For some values of $x$ this series
converges, but for others it diverges. Particularly important in
applications are series of functions convergent for all values of $x$
belonging to a certain interval, which may in particular be the whole of
the real axis or the positive half of it and so forth. Then the necessity
arises for differentiating such series term by term, integrating them,
deciding whether their sum is continuous, and so forth. For the familiar
case of the sum of a finite number of terms, there are simple general
rules. We know that the derivative of a sum of differentiable functions is
equal to the sum of their derivatives, the integral of a sum of continuous
functions is the sum of their integrals, and a sum of continuous functions
is itself a continuous function. All this holds for the sum of a finite
number of terms.

But for infinite series these simple rules are in general no longer true.
We could give many examples of convergent series of functions for which
the rules of termwise integration and differentiation are false. In the
same way a series of continuous functions may turn out to have a
discontinuous sum. On the other hand many infinite series behave like
finite sums with respect to these rules.

Profound investigations of this question have shown that these rules
may still be applied if the infinite series in question are not only
convergent at each separate point of the interval of definition (the
domain over which $x$ varies) but are also uniformly convergent over the
whole interval. In this way there was crystallized in mathematical
analysis, in the middle of the 19th century, the important concept of the
uniform convergence of a series.

Let us consider the series

$$S(x) = u_0(x) + u_1(x) + u_2(x) + \cdots,$$

whose terms are functions defined on the interval $[a, b]$. We suppose
that for each separate value of $x$ in the interval this series converges to
a certain sum $S(x)$. The sum of the first $n$ terms of the series

$$S_n(x) = u_0(x) + u_1(x) + \cdots + u_{n-1}(x)$$

is also a certain function of $x$, defined on $[a, b]$.

We now introduce a magnitude $\eta_n$, which is equal to the least upper
bound of the values* $|S(x) - S_n(x)|$, as $x$ varies on the interval $[a, b]$.
This magnitude is written as follows†

$$\eta_n = \sup_{a \le x \le b} |S_n(x) - S(x)|.$$

In case the quantity $|S(x) - S_n(x)|$ attains its maximum value, which will
certainly occur, for example, when $S(x)$ and $S_n(x)$ are continuous, then
$\eta_n$ is simply the maximum of $|S(x) - S_n(x)|$ on $[a, b]$.

From the assumed convergence of our series, we have for every individual
value of $x$ in the interval $[a, b]$

$$\lim_{n \to \infty} |S(x) - S_n(x)| = 0.$$

But the magnitude $\eta_n$ may approach zero or it may not. If $\eta_n$
approaches zero as $n \to \infty$, then the series is said to be uniformly
convergent, and in the opposite case nonuniformly convergent. In the same
sense it is possible to speak of the uniform or nonuniform convergence of
a sequence of functions $S_n(x)$ without necessarily interpreting them as
partial sums of a series.


Example 1. The series of functions

$$\frac{1}{x+1} - \frac{1}{(x+1)(x+2)} - \frac{1}{(x+2)(x+3)} - \cdots,$$

which we take to be defined only for nonnegative values of $x$, namely
on the half line $[0, \infty)$, may be written in the form

$$\frac{1}{x+1} + \left(\frac{1}{x+2} - \frac{1}{x+1}\right) + \left(\frac{1}{x+3} - \frac{1}{x+2}\right) + \cdots,$$

from which we see that its partial sums are equal to

$$S_n(x) = \frac{1}{x+n}$$

and

$$\lim_{n \to \infty} S_n(x) = 0.$$

Thus the series is convergent for all nonnegative $x$ and has the sum
$S(x) = 0$. Furthermore,

$$\eta_n = \sup_{0 \le x < \infty} |S_n(x) - S(x)| = \sup_{0 \le x < \infty} \frac{1}{x+n} = \frac{1}{n} \to 0 \quad (n \to \infty),$$

so that the series is uniformly convergent to zero on the half axis
$[0, \infty)$. Figure 37 shows the graphs of some of the partial sums $S_n(x)$.

* See Chapter XV.

† sup is an abbreviation for the Latin word supremum (highest).

Example 2. The series

$$x + x(x - 1) + x^2(x - 1) + \cdots$$

may be written in the form

$$x + (x^2 - x) + (x^3 - x^2) + \cdots,$$

from which

$$S_n(x) = x^n,$$

and therefore

$$\lim_{n \to \infty} S_n(x) = \begin{cases} 0, & \text{if } 0 \le x < 1; \\ 1, & \text{if } x = 1. \end{cases}$$

Thus the sum of the series is discontinuous on the interval $[0, 1]$ with a
discontinuity at the point $x = 1$. The quantity $|S_n(x) - S(x)|$ is less
than unity for every $x$ in $[0, 1]$ but for $x$ close to $x = 1$ it is
arbitrarily close to unity. So,

$$\eta_n = \sup_{0 \le x \le 1} |S_n(x) - S(x)| = 1$$

for all $n = 1, 2, \cdots$. Thus the series is nonuniformly convergent on the
interval $[0, 1]$. Figure 38 shows some of the graphs of the functions
$S_n(x)$. The graph of the sum of the series consists of the segment
$0 \le x < 1$ of the $x$-axis omitting the right end point and of the
point $(1, 1)$.

This example shows that the sum of a nonuniformly convergent series
of continuous functions may in fact be a discontinuous function.

On the other hand, if we consider the series on the interval $0 \le x \le q$
with $q < 1$, then

$$\eta_n = \sup_{0 \le x \le q} |S_n(x) - S(x)| = \max_{0 \le x \le q} x^n = q^n \to 0 \quad (n \to \infty),$$

so that on this interval the series converges uniformly and its sum is
continuous. The fact that the sum of a uniformly convergent series of
continuous functions is itself a continuous function is a general rule, as
was pointed out earlier, which can be rigorously proved.
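
The magnitudes $\eta_n$ of Examples 1 and 2 can be estimated on a grid of sample points. This is a rough sketch under our own assumptions: the grids, the cut-off of the half line at $x = 100$, and the function names are illustrative, and a maximum over finitely many samples only bounds the supremum from below.

```python
def eta_example1(n, xs):
    """Example 1: |S_n(x) - S(x)| = 1/(x + n), since S(x) = 0."""
    return max(1 / (x + n) for x in xs)

def eta_example2(n, xs):
    """Example 2: S_n(x) = x**n, with S(x) = 0 for x < 1 and S(1) = 1."""
    return max(abs(x**n - (1.0 if x == 1 else 0.0)) for x in xs)

grid_half_line = [k / 10 for k in range(0, 1001)]        # samples of [0, 100]
grid_unit = [k / 10**5 for k in range(0, 10**5 + 1)]     # samples of [0, 1]
grid_q = [0.9 * k / 1000 for k in range(0, 1001)]        # samples of [0, 0.9]

for n in (5, 20, 50):
    print(n,
          eta_example1(n, grid_half_line),   # equals 1/n, attained at x = 0
          eta_example2(n, grid_unit),        # stays close to 1
          eta_example2(n, grid_q))           # equals 0.9**n, tends to 0
```

On a fixed grid the second column would eventually fall for very large $n$, because the sample nearest to $x = 1$ is still at a positive distance from it; the true supremum remains equal to 1.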

Example 3. The sum of the first $n$ terms of the series, $S_n(x)$, has the
graph represented by the heavy broken line in figure 39. Obviously, for
all $n$ we have $S_n(0) = 0$, but if $0 < x \le 1$, then for all $n \ge 1/x$ we
will have $S_n(x) = 0$, and consequently for arbitrary $x$ in the interval
$[0, 1]$,

$$S(x) = \lim_{n \to \infty} S_n(x) = 0.$$

On the other hand,

$$\eta_n = \sup_{0 \le x \le 1} |S_n(x) - S(x)| = \sup_{0 \le x \le 1} |S_n(x)| = n^2.$$

So the quantity $\eta_n$ does not approach zero but even approaches infinity.
We now note that the series corresponding to this sequence $S_n(x)$ cannot
be integrated term by term on the interval $[0, 1]$, since

$$\int_0^1 S(x)\,dx = 0, \qquad \int_0^1 S_n(x)\,dx = \frac{1}{2}\, n^2 \cdot \frac{1}{n} = \frac{n}{2},$$

so that the series

$$\int_0^1 S_1(x)\,dx + \int_0^1 [S_2(x) - S_1(x)]\,dx + \int_0^1 [S_3(x) - S_2(x)]\,dx + \cdots$$

reduces to the divergent series

$$\frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots.$$

Fig. 39.

Let us state without proof the fundamental properties of uniformly
convergent series:

1. The sum of a series of continuous functions which is uniformly
convergent on the interval $[a, b]$ is a continuous function on this
interval.

2. If the series of continuous functions

$$S(x) = u_0(x) + u_1(x) + u_2(x) + \cdots \tag{57}$$

converges uniformly on the interval $[a, b]$, then it may be integrated
term by term on this interval; that is, for all $x_1$, $x_2$ in $[a, b]$
we have the equality

$$\int_{x_1}^{x_2} S(t)\,dt = \int_{x_1}^{x_2} u_0(t)\,dt + \int_{x_1}^{x_2} u_1(t)\,dt + \cdots.$$

3. If on the interval $[a, b]$ the series (57) converges and the functions
$u_k(x)$ have continuous derivatives, then the equality

$$S'(x) = u_0'(x) + u_1'(x) + u_2'(x) + \cdots, \tag{58}$$

obtained by termwise differentiation of (57) will be valid on the interval
$[a, b]$ if the series on the right in (58) converges uniformly.
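
The failure of termwise integration in Example 3 can be checked with exact arithmetic. The sketch below rests on an assumption: since figure 39 is not reproduced here, we take $S_n$ to be a "tent" that rises linearly from 0 at $x = 0$ to the height $n^2$ at $x = 1/(2n)$ and falls back to 0 at $x = 1/n$. This shape agrees with the values $\eta_n = n^2$ and $\int_0^1 S_n\,dx = n/2$ given in the text, but the exact position of the peak is our guess.

```python
from fractions import Fraction

def S_n(n, x):
    """Assumed tent: 0 at x = 0, peak n**2 at x = 1/(2n), 0 for x >= 1/n."""
    x = Fraction(x)
    peak = Fraction(1, 2 * n)
    if x <= 0 or x >= 2 * peak:
        return Fraction(0)
    if x <= peak:
        return n**2 * x / peak
    return n**2 * (2 * peak - x) / peak

def integral_S_n(n):
    """Exact integral over [0, 1]: the trapezoid rule applied between the
    breakpoints is exact for a piecewise linear function."""
    pts = [Fraction(0), Fraction(1, 2 * n), Fraction(1, n), Fraction(1)]
    return sum((b - a) * (S_n(n, a) + S_n(n, b)) / 2
               for a, b in zip(pts, pts[1:]))

# At a fixed point the values eventually vanish ...
print([S_n(n, Fraction(1, 10)) for n in (2, 5, 10, 20)])
# ... while the integrals grow like n/2, so each term of the
# termwise-integrated series equals 1/2.
print([integral_S_n(n) for n in range(1, 6)])
```

The limit function has integral 0, whereas the integrals of the partial sums increase without bound; this is possible only because the convergence is not uniform.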

Power series. In §9, a function $f(x)$ defined on an interval $[a, b]$ was
called analytic if on this interval it has derivatives of arbitrary order
and if in a sufficiently small neighborhood of any point $x_0$ of the
interval $[a, b]$ it may be expanded in a convergent Taylor series

$$f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots.$$

If we introduce the notation

$$a_n = \frac{f^{(n)}(x_0)}{n!}, \tag{59}$$

this series may be written in the following form

$$f(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots. \tag{60}$$






A series of this sort, where the numbers $a_0, a_1, a_2, \cdots$ are constants
independent of $x$, is called a power series.

As an example let us consider the power series

$$1 + x + x^2 + x^3 + \cdots, \tag{61}$$

whose terms form a geometric progression.

We know that for all values of $x$ in the interval $-1 < x < 1$ this
series converges and its sum is equal to

$$S(x) = \frac{1}{1 - x}.$$

For other values of $x$ the series diverges.

It is also easy to see that the difference between the sum of the series
and the sum of its first $n$ terms is given by the formula

$$S(x) - S_n(x) = \frac{x^n}{1 - x}, \tag{62}$$

and if $-q \le x \le q$, where $q$ is a positive number less than unity, then

$$\eta_n = \max_{-q \le x \le q} |S(x) - S_n(x)| = \frac{q^n}{1 - q}.$$

From this it is clear that $\eta_n$ approaches zero with increasing $n$ so that
the series is uniformly convergent on the interval $-q \le x \le q$, for all
positive values of $q < 1$.
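
Formula (62) and the resulting value of $\eta_n$ can be confirmed numerically. What follows is a sketch only: the value $q = 0.5$, the number of sample points, and the helper names are illustrative assumptions, and the grid maximum coincides with the true maximum here because the latter is attained at the end point $x = q$, which belongs to the grid.

```python
def partial_sum(x, n):
    """S_n(x) = 1 + x + ... + x**(n-1), the sum of the first n terms of (61)."""
    return sum(x**k for k in range(n))

def eta(n, q, samples=2001):
    """Grid estimate of max |S(x) - S_n(x)| over [-q, q], S(x) = 1/(1 - x)."""
    xs = [-q + 2 * q * k / (samples - 1) for k in range(samples)]
    return max(abs(1 / (1 - x) - partial_sum(x, n)) for x in xs)

q = 0.5
for n in (5, 10, 20):
    # The two printed columns should agree: grid estimate and q**n / (1 - q).
    print(n, eta(n, q), q**n / (1 - q))
```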

It is easy to verify that the function

$$S(x) = \frac{1}{1 - x}$$

has a derivative of $n$th order, which is equal to

$$S^{(n)}(x) = \frac{n!}{(1 - x)^{n+1}},$$

from which

$$S^{(n)}(0) = n!,$$

and the sum of the first $n$ terms of the Taylor series for the function $S(x)$
exactly coincides for $x_0 = 0$ with the sum of the first $n$ terms of the
series (61). Moreover, we know that the remainder term of the formula,
given by the equality (62), approaches zero with increasing $n$, for all $x$
on the interval $-1 < x < 1$. Thus we have shown that the series (61)
is the Taylor series of its sum $S(x)$.

Let us note one further fact. From the interval of convergence
$-1 < x < 1$ of our series, let us choose an arbitrary point $x_0$. It is
easy to see that for all $x$ sufficiently close to $x_0$, namely for all $x$
satisfying the inequality

$$\frac{|x - x_0|}{1 - x_0} < 1,$$

we have the equality

$$S(x) = \frac{1}{1 - x} = \frac{1}{(1 - x_0)\left(1 - \dfrac{x - x_0}{1 - x_0}\right)} = \frac{1}{1 - x_0} + \frac{x - x_0}{(1 - x_0)^2} + \frac{(x - x_0)^2}{(1 - x_0)^3} + \cdots. \tag{63}$$

The reader may prove without difficulty that

$$\frac{S^{(n)}(x_0)}{n!} = \frac{1}{(1 - x_0)^{n+1}}.$$

Consequently, series (63) is the Taylor series of its sum $S(x)$ and converges
to it in a sufficiently small neighborhood of any point $x_0$ belonging to
the interval of convergence of (61). Since the point $x_0$ is arbitrary, this
means that the function $S(x)$ is analytic on the interval.

All these facts that we have observed for the particular power series (61)
hold for arbitrary power series.* Namely, for every power series of the
form (60) where the constants $a_k$ are chosen by any given law, there
exists a certain nonnegative number $R$ (which may also be infinite), called
the radius of convergence of the series (60), with the following properties:

1. For all values of $x$ from the interval $x_0 - R < x < x_0 + R$, which
is called its interval of convergence, the series converges and its sum
$S(x)$ is an analytic function of $x$ in its interval. Here the convergence is
uniform for every interval $[a, b]$ lying completely within the interval of
convergence. The series itself is the Taylor series of its sum.

2. At the end points of the interval of convergence, the series may
converge or diverge, depending on its individual character. But it will
certainly diverge outside the closed interval $x_0 - R \le x \le x_0 + R$.

* For more detailed information on this point see Chapter IX.

We suggest to the reader that he consider the power series

$$1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$

$$1 + x + 2!\,x^2 + 3!\,x^3 + \cdots,$$

$$1 + x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots$$

and convince himself that their radii of convergence are respectively
infinite, zero, and unity.
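
One way to carry out this exercise is through the ratio $|a_n / a_{n+1}|$ of successive coefficients: when this ratio has a limit (finite or infinite) as $n$ grows, the limit is the radius $R$, which follows from d'Alembert's criterion applied to the terms $a_n x^n$. The sketch below only evaluates the ratio at a few finite $n$; the function names are our own.

```python
from fractions import Fraction
from math import factorial

def ratio(a, n):
    """|a_n / a_(n+1)| as an exact fraction."""
    return abs(Fraction(a(n)) / Fraction(a(n + 1)))

exp_like = lambda n: Fraction(1, factorial(n))   # 1 + x/1! + x^2/2! + ...
fact_like = lambda n: Fraction(factorial(n))     # 1 + x + 2! x^2 + ...
log_like = lambda n: Fraction(1, n)              # x^n / n, used for n >= 1

for n in (10, 100, 1000):
    print(n,
          ratio(exp_like, n),          # n + 1, grows without bound: R infinite
          float(ratio(fact_like, n)),  # 1/(n + 1), tends to 0:      R = 0
          float(ratio(log_like, n)))   # (n + 1)/n, tends to 1:      R = 1
```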

By the definition given earlier every analytic function may be expanded,
in a sufficiently small neighborhood of an arbitrary point where it is
defined, into a power series which converges to the function. Conversely,
from what has been said it follows that the sum of every power series
whose radius of convergence is not zero is an analytic function in its
interval of convergence.

So we see that power series are organically connected with analytic
functions. We could even say that on their interval of convergence power
series are the natural means of representing analytic functions, and
consequently they are also the natural means of approximating analytic
functions by algebraic polynomials.*

For example, from the fact that the function $1/(1 - x)$ may be expanded
in the power series

$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots,$$

which is convergent on the interval $-1 < x < 1$, it follows that the
power series is uniformly convergent on an arbitrary interval
$-a \le x \le a$ with $a < 1$, and this implies the possibility of
approximating the function on the whole interval $[-a, a]$ by the partial
sums of the series with any preassigned degree of accuracy.

Let us suppose that we are required to approximate the function
$1/(1 - x)$ by polynomials on the interval $[-\tfrac12, \tfrac12]$ with an
accuracy of 0.01. We note that for all $x$ in this interval we have the
inequality

$$\left|\frac{1}{1 - x} - 1 - x - \cdots - x^n\right| = |x^{n+1} + x^{n+2} + \cdots| \le \frac{1}{2^{n+1}} + \frac{1}{2^{n+2}} + \cdots = \frac{1}{2^n},$$

and since $2^6 = 64$ and $2^7 = 128$, the desired polynomial, approximating
the function on the whole interval $[-\tfrac12, \tfrac12]$ with an accuracy
of 0.01, will have the form

$$\frac{1}{1 - x} \approx 1 + x + x^2 + \cdots + x^7.$$

* Approximations going beyond the limits of the interval of convergence of a power
series require other methods. (See Chapter XII.)
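
The estimate above can be checked directly. In this sketch the grid of 1001 sample points on $[-\tfrac12, \tfrac12]$ and the names `poly`, `max_err_6`, `max_err_7` are illustrative choices; the largest error occurs at the end point $x = \tfrac12$, which is one of the samples.

```python
def poly(x, degree):
    """Partial sum 1 + x + ... + x**degree of the geometric series."""
    return sum(x**k for k in range(degree + 1))

xs = [-0.5 + k / 1000 for k in range(1001)]   # samples of [-1/2, 1/2]

max_err_6 = max(abs(1 / (1 - x) - poly(x, 6)) for x in xs)
max_err_7 = max(abs(1 / (1 - x) - poly(x, 7)) for x in xs)

print(max_err_6)   # about 1/64, which exceeds 0.01
print(max_err_7)   # about 1/128, which is below 0.01
```

Degree 6 is thus not sufficient on this interval, while degree 7 is, in agreement with $2^6 = 64 < 100 < 128 = 2^7$.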


Let us note one further extremely valuable property of power series:
They may be differentiated termwise everywhere in the interval of
convergence. This property finds extremely wide application in the solution
of various problems in mathematics.

For example, let it be required to find the solution of the differential
equation $y' = y$ under the auxiliary condition $y(0) = 1$. We will seek the
solution in the form of a power series,

$$y = a_0 + a_1 x + a_2 x^2 + \cdots.$$

Because of the auxiliary condition, we must set $a_0 = 1$. Assuming that
this series converges, we may differentiate it termwise; as a result we
obtain

$$y' = a_1 + 2a_2 x + 3a_3 x^2 + \cdots.$$

If we substitute these two series into the differential equation and equate
coefficients for each of the powers of $x$, we obtain

$$a_k = \frac{1}{k!} \quad (k = 1, 2, \cdots),$$

and the desired solution has the form

$$y = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$

It is well known that this series converges for all values of $x$ and that its
sum is equal to $y = e^x$.
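
The computation of the coefficients can be followed step by step. Equating the coefficients of $x^{k-1}$ in $y' = y$ gives $k\,a_k = a_{k-1}$, and together with $a_0 = 1$ this yields $a_k = 1/k!$. The sketch below builds the coefficients from this recurrence and compares a partial sum with $e^x$; the function names and the number of terms kept are our own choices.

```python
import math
from fractions import Fraction

def coefficients(count):
    """a_0 = 1 and k * a_k = a_(k-1), from equating coefficients in y' = y."""
    a = [Fraction(1)]
    for k in range(1, count):
        a.append(a[-1] / k)
    return a

def y(x, terms=30):
    """Partial sum of the power series solution, evaluated exactly, then rounded."""
    x = Fraction(x)
    return float(sum(c * x**k for k, c in enumerate(coefficients(terms))))

print(coefficients(5))     # the fractions 1, 1, 1/2, 1/6, 1/24
print(y(1), math.e)        # the partial sum at x = 1 against e
```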

In this case we have obtained a series whose sum is a well-known
elementary function. But this does not always happen; it may turn out
that a convergent power series so obtained has a sum that is not an
elementary function. An example is the series

$$y_p(x) = x^p\left[1 - \frac{x^2}{2(2p + 2)} + \frac{x^4}{2 \cdot 4\,(2p + 2)(2p + 4)} - \cdots\right],$$

obtained as a solution of Bessel’s differential equation, which is of great
importance in applications. In this way power series may serve to define
functions.


CHAPTER III

ANALYTIC GEOMETRY


§1. Introduction 

In the first half of the 17th century a completely new branch of
mathematics arose, the so-called analytic geometry, establishing a connection
between curves in a plane and algebraic equations in two unknowns.

A quite rare event thereby happened in mathematics: In one or two
decades there appeared a great, entirely new branch of mathematics
based on a very simple concept, which until then had not received proper
attention. The appearance of analytic geometry in the first half of the
17th century was not accidental. The transition in Europe to the new
capitalistic methods of manufacture required the advance of a whole
series of sciences. A short time before, contemporary mechanics was
being created by Galileo and other scientists, experimental data were
being accumulated in all regions of natural science, the means of
observation were being perfected, and instead of obsolete scholastic
theories new ones were being created. In astronomy, among the foremost
scientists the teachings of Copernicus had finally triumphed. The rapid
development of long-range navigation insistently called for knowledge of
astronomy and the elements of mechanics.

The art of warfare also required mechanics. Ellipses and parabolas,
whose geometric properties as conic sections were already well known
in detail to the ancient Greeks almost 2000 years earlier, ceased to be
only part of geometry, as they were to the Greeks. After Kepler had
discovered that the planets revolve around the sun in ellipses, and Galileo
that a stone thrown into the air traces out a parabola, it was necessary
to calculate these ellipses and to find the parabolas along which bullets
fly from a gun; it was necessary to discover the law by which the
atmospheric pressure, discovered by Pascal, decreases with the height; it
was necessary actually to calculate the volumes of various bodies, and
so forth.

All these questions almost simultaneously called to life three entirely
new mathematical sciences: analytic geometry, differential calculus, and
integral calculus, including the solution of the simplest differential
equations.

These three new fields qualitatively changed the face of the whole of
mathematics. They made it possible to solve problems never even dreamed
of before.

In the first half of the 17th century, i.e., at the beginning of the 1600’s,
a group of the most outstanding mathematicians was already close to
the idea of analytic geometry, but there were two of them, in particular,
who understood clearly the possibility of creating a new branch of
mathematics. These were Pierre Fermat, a counsellor of the parliament
of the French city of Toulouse and a world-famous mathematician, and
the famous French philosopher René Descartes. Descartes is credited
with being the chief creator of analytic geometry. He was the one who,
as a philosopher, raised the question of its complete generality. Descartes
published the great philosophical treatise “Discourse on the method of
rightly conducting the reason and seeking the truth in the sciences, with
applications: dioptrics, meteorology and geometry.”

The last part of this work, entitled “Geometry” and published in 1637,
contains a sufficiently complete, although somewhat confusing,
presentation of the mathematical theory that since then has been called
analytic geometry.


§2. Descartes’ Two Fundamental Concepts 

Descartes wished to create a method that could equally well be applied
to the solution of all problems of geometry, that is, which would provide
a general method for their solution. Descartes’ theory is based on two
concepts: the concept of coordinates and the concept of representing by
the coordinate method any algebraic equation with two unknowns in the
form of a curve in the plane.

The concept of coordinates. By the coordinates of a point in the plane
Descartes means the abscissa and ordinate of this point, i.e., the numerical
values $x$ and $y$ of its distances (with corresponding signs) to two mutually
perpendicular straight lines (coordinate axes) chosen in this plane (see
Chapter II). The point of intersection of the coordinate axes, i.e., the
point having coordinates $(0, 0)$, is called the origin.

With the introduction of coordinates Descartes constructed, so to
speak, an “arithmetization” of the plane. Instead of determining any
point geometrically, it is sufficient to give a pair of numbers $x$, $y$ and
conversely (figure 1).

The notion of comparison of equations with two unknowns with curves
in the plane. Descartes’ second concept is the following. Up to the time
of Descartes, where an algebraic equation in two unknowns $F(x, y) = 0$
was given, it was said that the problem was indeterminate, since from
the equation it was impossible to determine these unknowns; any value
could be assigned to one of them, for example to $x$, and substituted in
the equation; the result was an equation with only one unknown $y$, for
which, in general, the equation could be solved. Then this arbitrarily
chosen $x$ together with the so-obtained $y$ would satisfy the given equation.
Consequently, such an “indeterminate” equation was not considered
interesting.

Descartes looked at the matter differently. He proposed that in an
equation with two unknowns, $x$ be regarded as the abscissa of a point
and the corresponding $y$ as its ordinate. Then if we vary the unknown $x$,
to every value of $x$ the corresponding $y$ is computed from the equation,
so that we obtain, in general, a set of points which form a curve (figure 2).*

Fig. 2.

* Sometimes the equation is not satisfied by any point $(x, y)$ with real coordinates,
sometimes by one or a few such points. In this case we say that the curve is imaginary or
reduces to points (see §7).

Thus, to each algebraic equation with two variables, $F(x, y) = 0$,
corresponds a completely determined curve of the plane, namely a curve
representing the totality of all those points of the plane whose coordinates
satisfy the equation $F(x, y) = 0$.

This observation of Descartes opened up an entire new science. 

The basic problems solved by analytic geometry and the definition of
analytic geometry. Analytic geometry provides the possibility: (1) of
solving construction problems by computation (see, for example, the
division of a segment in a given ratio, §3); (2) of finding the
equation of curves defined by a geometric property (for example, of
an ellipse defined by the condition that the sum of distances to two
given points is constant, see §7); (3) of proving new geometric theorems
algebraically (see, for example, the derivation of Newton’s theory of
diameters, §6); (4) conversely, of representing an algebraic equation
geometrically, to clarify its algebraic properties (see, for example, the
solution of third- and fourth-degree equations from the intersection of a
parabola with a circle, §5).

Thus, analytic geometry is that part of mathematics which, applying
the coordinate method, investigates geometric objects by algebraic means.


§3. Elementary Problems 


The coordinates of a point that divides a segment in a given ratio. Given
the coordinates $(x_1, y_1)$ and $(x_2, y_2)$ of two points $M_1$ and $M_2$, let us
find the coordinates $(x, y)$ of the point $M$ dividing the segment $M_1M_2$ in
the ratio $m$ to $n$ (figure 3). From the similarity of the shaded triangles
we obtain:

$$\frac{x - x_1}{x_2 - x} = \frac{m}{n}, \quad \text{from which} \quad x = \frac{n x_1 + m x_2}{m + n};$$

$$\frac{y - y_1}{y_2 - y} = \frac{m}{n}, \quad \text{from which} \quad y = \frac{n y_1 + m y_2}{m + n}.$$

Fig. 3.

Distance between two points. Let us find the distance $d$ between
the points $M_1$ and $M_2$, whose coordinates are $(x_1, y_1)$ and
$(x_2, y_2)$ respectively. From the shaded right triangle (figure 4), we obtain
by the theorem of Pythagoras

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$

The area of a triangle. Let us find the area $S$ of the triangle
$M_1M_2M_3$ (figure 5) if the coordinates of its vertices are respectively
$(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$. Considering the area of the triangle as
the sum of the areas of trapezoids with bases $y_1, y_3$ and $y_3, y_2$ minus
the area of the trapezoid with bases $y_1, y_2$, and writing the product
$-(y_1 + y_2)(x_2 - x_1)$ in the form $(y_1 + y_2)(x_1 - x_2)$, we obtain

$$S = \tfrac{1}{2}\left[(y_1 + y_2)(x_1 - x_2) + (y_2 + y_3)(x_2 - x_3) + (y_3 + y_1)(x_3 - x_1)\right].$$

In these problems it only remains to verify that the derived formulas
remain valid without any change in those cases when one or more
coordinates or their differences are negative. Such verification easily follows.
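
The three formulas of this section translate directly into a few lines of code. This is a minimal sketch; the function names and the representation of a point as a pair `(x, y)` are our own conventions. Note that the area formula, applied as written, yields a signed quantity: it is positive when the vertices $M_1, M_2, M_3$ are listed counterclockwise and changes sign when their order is reversed, so the absolute value is to be taken when the orientation is not known in advance.

```python
import math

def divide_segment(p1, p2, m, n):
    """Point M on the segment p1 p2 with |p1 M| : |M p2| = m : n."""
    (x1, y1), (x2, y2) = p1, p2
    return ((n * x1 + m * x2) / (m + n), (n * y1 + m * y2) / (m + n))

def distance(p1, p2):
    """Distance between two points by the theorem of Pythagoras."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x2 - x1, y2 - y1)

def signed_area(p1, p2, p3):
    """Area of the triangle by the trapezoid formula of the text (signed)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return 0.5 * ((y1 + y2) * (x1 - x2)
                  + (y2 + y3) * (x2 - x3)
                  + (y3 + y1) * (x3 - x1))

print(divide_segment((0, 0), (6, 3), 1, 2))      # one third of the way along
print(distance((0, 0), (3, 4)))
print(signed_area((0, 0), (4, 0), (0, 3)))       # counterclockwise order
print(signed_area((0, 0), (0, 3), (4, 0)))       # clockwise order
```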

Determination of the points of intersection of two curves. Relying on
the second fundamental idea that the equation $F(x, y) = 0$ represents a
curve, it is particularly simple to find the points of intersection of two
curves. In order to find the coordinates of the points of intersection of
two curves, it is obviously necessary to solve simultaneously the equations
that represent them. The pair of numbers $x$, $y$ obtained from the ordinary
solution of these two equations will determine the point whose coordinates
satisfy both of the equations, i.e., the point that lies on the first as well
as on the second curve, and this is the point of their intersection.

The solution of geometric problems by the tools of analytic geometry,
as we see, is very convenient for practical purposes, especially because
every solution is at once obtained in the convenient form of numbers.
Such a geometry, such a science, was exactly what was lacking at that time.


§4. Discussion of Curves Represented by First- and Second-Degree 
Equations 

First-degree equation. Making use of his second idea, Descartes first
of all examined what curves correspond to an equation of the first degree,

$$Ax + By + C = 0, \tag{1}$$

i.e., to an equation where $A$, $B$, $C$ are numerical coefficients with $A$
and $B$ not both zero. It turned out that in the plane a straight line always
corresponds to such an equation.

We shall prove that equation (1) always represents a straight line, and
conversely, that to every line in the plane there corresponds a completely
determined equation of the form (1). In fact, let us suppose, for example,
that $B \ne 0$; then equation (1) can be solved for $y$:

$$y = kx + l, \quad \text{where} \quad k = -\frac{A}{B}, \quad l = -\frac{C}{B}.$$


We examine first the equation $y = kx$. It obviously represents a
straight line passing through the origin and making an angle $\varphi$ with the
$x$-axis whose tangent $\tan\varphi$ is $k$ (figure 6). Indeed, the equation can be
written as $y/x = k$, so that the coordinates of every point $(x, y)$ on the
straight line satisfy the equation, and the coordinates of no point
$(\bar{x}, \bar{y})$ not lying on the straight line satisfy the equation, since for
such a point $\bar{y}/\bar{x}$ will be either greater than or smaller than $k$. In
addition, if $\tan\varphi > 0$, then for this line either both $x$ and $y$ are
positive or both negative, and if $\tan\varphi < 0$ their signs are opposite.

Thus, the equation $y = kx$ represents a straight line passing through
the origin $O$, and consequently the equation $y = kx + l$ also represents
a line, namely the one which is obtained from the previous line by the
parallel translation such that the ordinate of each of its points is increased
by $l$ (figure 7).

The earlier derived formulas of the coordinates of a point dividing a
segment in a given ratio, the distance between two given points, and the
area of a triangle as well as the information about the equation of a
straight line already enable us to solve a large number of problems.


The equation of a straight line passing through one or two given points.

Let $M_1$ be the point with coordinates $x_1$, $y_1$ and let $k$ be a given
number. The equation $y = kx + l$ represents a straight line making with the
$Ox$-axis an angle whose tangent is equal to $k$ and intersecting the $Oy$-axis
at a distance $l$ from $O$. Let us choose $l$ such that this line goes through
the point $(x_1, y_1)$. For this, the coordinates of the point $M_1$ must satisfy
the equation, i.e., we must have $y_1 = kx_1 + l$, from which $l = y_1 - kx_1$.

Substituting this value for $l$, we obtain the equation of the line that
passes through the given point $(x_1, y_1)$ and makes with the $Ox$-axis
an angle whose tangent is equal to $k$ (figure 8). This equation is
$y = kx + y_1 - kx_1$ or

$$y - y_1 = k(x - x_1).$$

Example. Let the angle between the line and the $Ox$-axis be equal
to 45°, and let the point $M_1$ have coordinates $(3, 7)$; then the equation of
the corresponding line (since $\tan 45° = 1$) will be: $y - 7 = 1 \cdot (x - 3)$
or $x - y + 4 = 0$.

If we require that the line passing through the point $(x_1, y_1)$ also
go through the point $(x_2, y_2)$, it follows that the condition
$y_2 - y_1 = k(x_2 - x_1)$ must be imposed on $k$. Finding $k$ from this and
substituting it in the previous equation, we obtain the equation of the line
passing through two given points (figure 9):

$$\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1}.$$
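
Both forms of the equation can be brought to the general form (1). The sketch below is our own illustration: clearing denominators in the two-point equation gives $A = y_2 - y_1$, $B = x_1 - x_2$, $C = -(Ax_1 + By_1)$, a form which, unlike the quotient form, remains meaningful for vertical and horizontal lines. The function names are hypothetical.

```python
def point_slope(p, k):
    """y - y1 = k (x - x1), rewritten as k*x - y + (y1 - k*x1) = 0."""
    x1, y1 = p
    return k, -1, y1 - k * x1

def line_through(p1, p2):
    """Coefficients (A, B, C) of A*x + B*y + C = 0 through two distinct points."""
    (x1, y1), (x2, y2) = p1, p2
    if (x1, y1) == (x2, y2):
        raise ValueError("the two points must be distinct")
    A = y2 - y1
    B = x1 - x2
    C = -(A * x1 + B * y1)
    return A, B, C

# The example of the text: slope 1 through (3, 7) gives x - y + 4 = 0.
print(point_slope((3, 7), 1))

# The same line through (3, 7) and (5, 9): coefficients proportional to the above.
print(line_through((3, 7), (5, 9)))

# A vertical line, for which the slope k does not exist.
print(line_through((2, 0), (2, 5)))
```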

Descartes’ result concerning second-degree equations. Descartes also
investigated the question as to what kinds of curves in the plane are
represented by the second-degree equation with two variables whose
general form is

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0,$$

and showed that such an equation, generally speaking, represents an
ellipse, a hyperbola, or a parabola; i.e., curves very well known to the
mathematicians of antiquity.

These are Descartes’ most important achievements. However, his book
was far from being restricted to these topics; he also investigated the
equations of a number of interesting geometric loci, examined certain
theorems on transformation of algebraic equations, mentioned without
proof his famous law of signs for the number of positive roots of an
equation whose roots are all real (see Chapter IV, §4) and, finally,
presented a remarkable method for determining the real roots of third-
and fourth-degree equations from the intersection of the parabola $y = x^2$
with circles.


§5. Descartes’ Method of Solving Third- and Fourth-Degree 
Algebraic Equations 

Transformation of third- and fourth-degree equations to an equation of
the fourth degree not involving the $x^3$-term. We will show that the
solution of an arbitrary third- or fourth-degree equation can be reduced
to the solution of an equation of the form

$$x^4 + px^2 + qx + r = 0. \tag{2}$$

Let the given third-degree equation be $z^3 + az^2 + bz + c = 0$.
Substituting $z = x - a/3$, we obtain

$$(x - a/3)^3 + a(x - a/3)^2 + b(x - a/3) + c = 0.$$

The $x^2$-terms in the expansion of the parentheses will cancel out, so that
we get an equation of the form $x^3 + px + q = 0$. Multiplying this
equation by $x$, we bring it to the form (2) with $r = 0$, which also admits
the root $x_4 = 0$.

An equation of the fourth degree $z^4 + az^3 + bz^2 + cz + d = 0$ can
be reduced to the form (2) by the substitution $z = x - a/4$. Hence, the
solution of all third- and fourth-degree equations can be reduced to the
solution of an equation of the form (2).


The solution of third- and fourth-degree equations by the intersection
of a circle with the parabola $y = x^2$. Let us first derive the equation of
a circle with center $(a, b)$ and radius $R$. If $(x, y)$ is any of its points,
then the square of its distance to the point $(a, b)$ is equal to
$(x - a)^2 + (y - b)^2$ (see §3). Thus, the equation of the circle in question is

$$(x - a)^2 + (y - b)^2 = R^2.$$

Now we try to find the points of intersection of this circle with the
parabola $y = x^2$. In order to do this, by virtue of what was said in §3,
it is necessary to solve simultaneously the equation of this circle

$$x^2 + y^2 - 2ax - 2by + a^2 + b^2 - R^2 = 0$$

and the equation of the parabola

$$y = x^2.$$

Substituting $y$ from the second equation into the first, we obtain a
fourth-degree equation in $x$:

$$x^2 + x^4 - 2ax - 2bx^2 + a^2 + b^2 - R^2 = 0$$

or

$$x^4 + (1 - 2b)x^2 - 2ax + a^2 + b^2 - R^2 = 0.$$

If we choose $a$, $b$ and $R^2$ such that

$$1 - 2b = p, \quad -2a = q, \quad a^2 + b^2 - R^2 = r,$$

then exactly equation (2) is obtained. For this purpose we have to take

$$a = -\frac{q}{2}, \quad b = \frac{1 - p}{2}, \quad R^2 = \frac{q^2}{4} + \frac{(1 - p)^2}{4} - r. \tag{3}$$

In the last formula (3), generally speaking, $R^2$ may turn out to be negative.
However, in the case when equation (2) has even one real root $x_1$, the
following equality holds:

$$x_1^4 + (1 - 2b)x_1^2 - 2ax_1 + a^2 + b^2 - R^2 = 0. \tag{4}$$

Denoting $x_1^2$ by $y_1$, equation (4) can be rewritten as

$$x_1^2 + y_1^2 - 2ax_1 - 2by_1 + a^2 + b^2 - R^2 = 0$$

or as

$$(x_1 - a)^2 + (y_1 - b)^2 = R^2.$$

Hence, in the case when equation (2) has a real root, the number
$R^2 = [(1 - p)^2 + q^2]/4 - r$ is positive, the equation

$$(x - a)^2 + (y - b)^2 = R^2$$

is the equation of a circle, and all real roots of equation (2) are the
abscissas of points of intersection of the parabola $y = x^2$ with this circle.
(In case $r = 0$, $R^2 = a^2 + b^2$, this circle passes through the origin.)

Thus, if the coefficients $p$, $q$, $r$ of equation (2) are given, we
find $a$, $b$ and $R^2$ by formulas (3). If $R^2 < 0$, equation (2)
is known to have no real roots. But if $R^2 \ge 0$, then the abscissas of the
points of intersection of the circle with center $(a, b)$ and radius $R$ with
the parabola $y = x^2$ (drawn once and for all) give all the real roots of
equation (2); in particular, if the resulting circle does not
intersect the parabola, equation (2) does not have real roots.
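The construction can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names `circle_for_quartic` and `lies_on_circle` are hypothetical, and the test equation $x^4 - 5x^2 + 4 = 0$ is chosen here because its roots are known, not taken from the text.

```python
import math

def circle_for_quartic(p, q, r):
    """Center (a, b) and squared radius R2 from formulas (3)."""
    a = -q / 2
    b = (1 - p) / 2
    R2 = (q * q + (1 - p) ** 2) / 4 - r
    return a, b, R2

def lies_on_circle(x, a, b, R2):
    """True if the parabola point (x, x^2) lies on the circle."""
    return math.isclose((x - a) ** 2 + (x * x - b) ** 2, R2, abs_tol=1e-9)

# x^4 - 5x^2 + 4 = 0 has the real roots -2, -1, 1, 2.
a, b, R2 = circle_for_quartic(-5, 0, 4)
```

Each real root $x$ of the quartic passes `lies_on_circle(x, a, b, R2)`, while a negative `R2` signals at once that there are no real roots.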

Example. Let the given fourth-degree equation be:

x* -4x 2 +x + ^ = 0. 

Then we have

1 , 5 ,1 



Figure 10 shows the corresponding
circle and the roots $x_1$, $x_2$, $x_3$, $x_4$
of the given equation.

Fig. 10.

The 1st, 2nd, 3rd and 4th sections above contain, in an abbreviated and
somewhat more modern form, the essential content of Descartes’ book.

From Descartes’ time up to the present, analytic geometry has undergone
an immense development that has been very fruitful for many
different parts of mathematics. We will attempt in the following sections
of this chapter to trace the most important stages of this development.

First of all, it is necessary to say that the inventors of the infinitesimal
analysis were already in possession of Descartes’ method. Whether it was
a question of tangents or normals (perpendiculars to the tangents at the
point of tangency) to curves, or of maxima or minima of functions
considered geometrically, or of the radius of curvature of a curve at a
given point, etc., the equation of the curve was considered first, by the
method of Descartes, and then the equations of the normal, the tangent,
and so forth, were found. Thus infinitesimal analysis, namely the
differential and integral calculus, would have been inconceivable without
the preliminary development of analytic geometry.


§6. Newton’s General Theory of Diameters 

The first mathematician to take a further great step forward in analytic
geometry itself was Newton. In 1704 he examined the theory of third-order
curves, i.e., curves which are represented by third-degree algebraic
equations in two unknowns. At the same time he found, among other
things, an elegant general theorem about “diameters,” which correspond
to secants in a given direction. He proved the following.

Let an $n$th-order curve be given, i.e., a curve which is represented by an
$n$th-degree algebraic equation in two unknowns; then an arbitrary straight
line intersecting it has in general $n$ common points with it. Let $M$
be the point of the secant that is the “center of gravity” of these
points of its intersection with the given $n$th-order curve, i.e., the center
of gravity of a set of $n$ equal point masses situated at these points. It
turns out that if we take all possible sets of mutually parallel secants and
for each of them consider these centers of mass $M$, then for any given set
of parallel secants all the points $M$ lie on a straight line. Newton called
this line the “diameter” of the $n$th-order curve corresponding to the given
direction of the secants. Since the proof of this theorem is quite easy with
the help of analytic geometry, we will give it here.

Fig. 11.

Let an $n$th-order curve be given and some set of mutually parallel
secants of the curve. Choose the coordinate axes so that these secants
are parallel to the $Ox$-axis (figure 11). Then their equations will have
the form $y = l$, where the constant $l$ will be different for different secants.
Let $F(x, y) = 0$ be the equation that represents the $n$th-order curve with
respect to these coordinate axes. It is easy to show that under a
transformation from one rectangular coordinate system to another, although
the equation of the curve changes, its order does not change (this will be
shown in §8). Therefore $F(x, y)$ will also be an $n$th-degree polynomial.
To determine the abscissas of the points of intersection of the curve with
the secant $y = l$, it is necessary to solve the simultaneous equations
$F(x, y) = 0$ and $y = l$; as a result, in general, an $n$th-degree equation
in $x$ is obtained

$$F(x, l) = 0, \tag{5}$$

from which we find the abscissas $x_1, x_2, \ldots, x_n$. The abscissa $x_c$ of the
center of gravity of the $n$ points of intersection is equal, by the very
definition of center of gravity, to

$$x_c = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$

But, as is known from the theory of algebraic equations, the sum
$x_1 + x_2 + \cdots + x_n$ of the roots of an equation is equal to the coefficient
of the $(n - 1)$th power of the unknown $x$, taken with the opposite sign,
divided by the coefficient of the $n$th power of $x$. But because the sum
of the exponents of $x$ and $y$ in every term of $F(x, y)$ is equal to or less
than $n$, the term in $x^n$ does not contain $y$ at all but has the form $Ax^n$,
where $A$ is a constant; and if the terms in $x^{n-1}$ contain $y$, they do so to
no higher than the first power; i.e., they have the form $x^{n-1}(By + C)$.
Consequently, the coefficient of $x^n$ is $A$ and that of $x^{n-1}$ is $Bl + C$, and
we have for any given $l$

$$x_c = -\frac{Bl + C}{nA}.$$

But the secant is parallel to the $Ox$-axis so that for all of its points $y = l$,
and hence the ordinate $y_c$ of the center of gravity of the points of its
intersection with the given $n$th-order curve is also equal to $l$; thus finally
we obtain $nAx_c + By_c + C = 0$, i.e., the coordinates $x_c$, $y_c$ of the
centers of gravity for all these secants satisfy a first-degree equation,
and consequently lie on a straight line.
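The argument can be replayed on a concrete curve. The sketch below is an assumption-laden illustration rather than Newton's own computation: the cubic is a hypothetical example, the names `centroid_on_secant` and `collinear` are invented here, and the center of gravity is obtained from the coefficients of $x^n$ and $x^{n-1}$ exactly as in the proof, so it counts all $n$ roots of equation (5), including complex ones.

```python
from fractions import Fraction

def centroid_on_secant(terms, n, l):
    """Center of gravity of the n intersections of F(x, y) = 0 with y = l.

    terms maps exponent pairs (i, j) to the coefficient of x^i y^j.
    """
    l = Fraction(l)
    lead = sum(c * l**j for (i, j), c in terms.items() if i == n)      # A
    sub = sum(c * l**j for (i, j), c in terms.items() if i == n - 1)   # B*l + C
    return (-sub / (n * lead), l)

def collinear(points):
    """True if all points lie on the line through the first two."""
    (x0, y0), (x1, y1) = points[0], points[1]
    return all((x1 - x0) * (y - y0) == (y1 - y0) * (x - x0) for x, y in points[2:])

# Hypothetical cubic: F = x^3 + 2x^2 y - x y^2 + y - 3, so A = 1, B = 2, C = 0.
cubic = {(3, 0): 1, (2, 1): 2, (1, 2): -1, (0, 1): 1, (0, 0): -3}
centers = [centroid_on_secant(cubic, 3, l) for l in (-2, -1, 0, 1, 3)]
```

For this cubic every center satisfies $3x_c + 2y_c = 0$, which is the diameter $nAx_c + By_c + C = 0$ with $n = 3$.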



Fig. 12. 



The case when $F(x, y)$ does not contain $x^n$ can be investigated analogously.

In case the curve is of the 2nd order ($n = 2$) the center of gravity of
two points is simply the midpoint between them, so that the locus of
midpoints of parallel chords of a second-order curve is a straight line
(figure 12), a result that for the ellipse, as well as for the hyperbola and
the parabola, was already well known to the ancients. But this was
proved by them, even though only for these special cases, with quite
difficult geometric arguments, and here a new general theorem, unknown
to the ancients, is proved in an entirely simple way.

Such examples reveal the power of analytic geometry.


§7. Ellipse, Hyperbola, and Parabola 

In this and the following sections, we consider second-order curves.
Before investigating the general second-degree equation, it is useful to
examine some of its simplest forms.

The equation of a circle with center at the origin. First of all, we
consider the equation

$$x^2 + y^2 = a^2.$$

It evidently represents a circle with center at the origin and radius $a$,
as follows from the theorem of Pythagoras applied to the shaded right
triangle (figure 13), since whatever point $(x, y)$ of this circle is taken,
its $x$ and $y$ coordinates satisfy this equation, and conversely, if the
coordinates $x$, $y$ of a point satisfy the equation, then the point belongs
to the circle; i.e., the circle is the set of all those points of the plane that
satisfy the equation.


The equation of an ellipse and its focal property. Let two points $F_1$
and $F_2$ be given, the distance between which is equal to $2c$. We will find
the equation of the locus of all points $M$ of the plane, the sum of whose
distances to the points $F_1$ and $F_2$ is equal to a constant $2a$ (where, of
course, $a$ is greater than $c$). Such a curve is called an ellipse and the
points $F_1$ and $F_2$ are its foci.

Let us choose a rectangular coordinate system such that the points $F_1$
and $F_2$ lie on the $Ox$-axis and the origin is halfway between them. Then
the coordinates of the points $F_1$ and $F_2$ will be $(c, 0)$ and $(-c, 0)$. Let us
take an arbitrary point $M$ with coordinates $(x, y)$, belonging to the locus
in question, and let us write that the sum of its distances to the points
$F_1$ and $F_2$ is equal to $2a$,

$$\sqrt{(x - c)^2 + (y - 0)^2} + \sqrt{(x + c)^2 + (y - 0)^2} = 2a. \tag{6}$$

This equation is satisfied by the coordinates $(x, y)$ of any point of the
locus under consideration. Obviously the converse is also true, namely
that any point whose coordinates satisfy equation (6) belongs to this
locus. Equation (6) is therefore the equation of the locus. It remains to
simplify it.

Raising both sides to the second power, we obtain

$$x^2 - 2cx + c^2 + y^2 + 2\sqrt{(x^2 - 2cx + c^2 + y^2)(x^2 + 2cx + c^2 + y^2)} + x^2 + 2cx + c^2 + y^2 = 4a^2,$$

or after simplification

$$x^2 + y^2 + c^2 - 2a^2 = -\sqrt{(x^2 + y^2 + c^2)^2 - 4c^2x^2}.$$

Squaring again both sides, we obtain

$$(x^2 + y^2 + c^2)^2 - 4a^2(x^2 + y^2 + c^2) + 4a^4 = (x^2 + y^2 + c^2)^2 - 4c^2x^2$$

or after simplification

$$(a^2 - c^2)x^2 + a^2y^2 = (a^2 - c^2)a^2.$$

Let us set $a^2 - c^2 = b^2$ (as may be done since $a > c$); then we obtain
$b^2x^2 + a^2y^2 = a^2b^2$, and dividing by $a^2b^2$ we have

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \tag{7}$$

The coordinates $(x, y)$ of any point $M$ of the locus thus satisfy
equation (7).

It can be shown on the other hand that if the coordinates of a point
satisfy equation (7) then they also satisfy equation (6). Consequently,
equation (7) is the equation of this locus, i.e., the equation of the ellipse
(figure 14).
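The equivalence of (6) and (7) can be spot-checked numerically. The following Python sketch is an illustration only: the function name `focal_distance_sum` is hypothetical, and the parametrization $x = a\cos t$, $y = b\sin t$ of the points satisfying (7) is an assumption introduced here, not used in the text.

```python
import math

def focal_distance_sum(a, b, t):
    """Sum of distances from the point (a cos t, b sin t) of ellipse (7) to the foci."""
    c = math.sqrt(a * a - b * b)
    x, y = a * math.cos(t), b * math.sin(t)
    return math.hypot(x - c, y) + math.hypot(x + c, y)

# Twelve points on the ellipse with a = 5, b = 3 (so c = 4).
sums = [focal_distance_sum(5, 3, k * math.pi / 6) for k in range(12)]
```

Up to rounding, every entry of `sums` equals $2a = 10$, as equation (6) requires.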



This argument is a classical example of finding the equation of a curve
given by some of its geometrical properties.

The well-known method of tracing an ellipse by means of a thread
(figure 15) is based on the property of the ellipse that the sum of the
distances of any of its points to two given points is a constant.

Remark. In order to determine an ellipse, we could have taken,
instead of the focal property considered here, any other geometric
property characteristic of it, for example, that the ellipse is the result of
a “uniform contraction” of a circle toward one of its diameters or
any other property.

Substituting $y = 0$ in equation (7) of the ellipse, we obtain $x = \pm a$,
i.e., $a$ is the length of the segment $OA$ (see figure 14), which is called
the major semiaxis of the ellipse. Analogously, substituting $x = 0$, we
obtain $y = \pm b$, i.e., $b$ is the length of the segment $OB$, which is called
the minor semiaxis of the ellipse.

The number $\varepsilon = c/a$ is called the eccentricity of the ellipse, so that,
since $c = \sqrt{a^2 - b^2} < a$, the eccentricity of an ellipse is less than 1.
In the case of a circle, $c = 0$ and consequently $\varepsilon = 0$; both foci are at
one point, the center of the circle (since $OF_1 = OF_2 = 0$), but the previous
method of drawing the curve with a thread is still valid.

Laws of planetary motion. In studying Tycho Brahe's long-continued
observations on the motion of the planet Mars, Kepler discovered that
the planets revolve around the Sun in ellipses such that the Sun occupies
one focus of the ellipse (the other focus remains unoccupied and plays no
part in the motion of a planet around the Sun) (figure 16). Kepler also
observed that the focal radius $\rho$ in equal times sweeps out sectors of
equal areas,* and Newton showed that the necessity of such a motion
follows mathematically from the second law of motion (proportionality of
acceleration to force) and the law of universal gravitation.

Fig. 16.

* The eccentricities of planetary orbits are not very large, so that the orbits of
planets are almost circles.

The ellipse of inertia. As an example of the application of the ellipse in
a technical problem, we consider the so-called ellipse of inertia of a plate.

Let the plate be of uniform thickness and homogeneous material, for
example a zinc plate of arbitrary shape. We rotate it around an axis in
its plane. A body in rectilinear motion has, as is well known, an inertia
with respect to this rectilinear motion that is proportional to its mass
(independently of the shape of the body and the distribution of the mass).
Similarly, a body rotating around an axis, for instance a flywheel, has
inertia with respect to this rotation. But in the case of rotation, the
inertia is not only proportional to the mass of the rotating body but
also depends on the distribution of the mass of the body with respect
to the axis of rotation, since the inertia with respect to rotation is greater
if the mass is farther from the axis. For example, it is very easy to bring
a stick at once into fast rotation around its longitudinal axis (figure 17a).
But if we try to bring it at once to fast rotation around an axis
perpendicular to its length, even if the axis passes through its midpoint, we
will find that unless this stick is very light, we must exert considerable
effort (figure 17b).

Fig. 17a.

Fig. 17b.

It is possible to show that the inertia of a body with respect to
rotation about an axis, the so-called moment of inertia of the body
relative to the axis, is equal to $\sum r_i^2 m_i$ (where by $\sum r_i^2 m_i$ we mean the
sum $r_1^2 m_1 + r_2^2 m_2 + \cdots + r_n^2 m_n$ and think of the body as decomposed
into very small elements, with $m_i$ as the mass of the $i$th element and $r_i$
the distance of the $i$th element from the axis of rotation, the summation
being taken over all elements of the body).

Let us return to our plate. Let $O$ (figure 18) be a point of this plate.
We consider the moments of inertia $J_u$ of the plate relative to an axis
of rotation $u$ passing through $O$ and lying in the plane of the plate. For
this purpose we take the point $O$ as the origin of a Cartesian coordinate
system and choose arbitrary axes $Ox$ and $Oy$ in the plane of the plate;
then we will characterize the axis of rotation $u$ by the angle $\varphi$ which it
makes with the $Ox$-axis. It is easy to see (figure 19) that

$$r_i = |(x_i \tan\varphi - y_i)\cos\varphi| = |x_i \sin\varphi - y_i \cos\varphi|.$$

Hence

$$\sum r_i^2 m_i = \sum (x_i^2 \sin^2\varphi - 2x_i y_i \sin\varphi\cos\varphi + y_i^2 \cos^2\varphi)\, m_i
= \sin^2\varphi \sum x_i^2 m_i - 2\sin\varphi\cos\varphi \sum x_i y_i m_i + \cos^2\varphi \sum y_i^2 m_i.$$

The quantities $\sin^2\varphi$, $2\sin\varphi\cos\varphi$, and $\cos^2\varphi$ are taken outside the
summation sign, since they are constant for a given axis $u$. We now write

$$\sum x_i^2 m_i = A, \quad -\sum x_i y_i m_i = B, \quad \sum y_i^2 m_i = C.$$

The quantities $A$, $B$, and $C$ do not depend on the choice of the axis $u$,
but only on the shape of the plate, the distribution of its mass, and the
fixed choice of the coordinate axes $Ox$ and $Oy$. Consequently,

$$J_u = A\sin^2\varphi + 2B\sin\varphi\cos\varphi + C\cos^2\varphi.$$

We consider all possible axes $u$ in the plane of the plate passing through
the point $O$ and lay off on each of these axes from the point $O$ a length
equal to $\rho$, the inverse of the square root of the moment of inertia $J_u$ of
the plate relative to that axis, i.e., $\rho = 1/\sqrt{J_u}$. Then we obtain

$$\frac{1}{\rho^2} = A\sin^2\varphi + 2B\sin\varphi\cos\varphi + C\cos^2\varphi.$$

But

$$x = \rho\cos\varphi, \quad y = \rho\sin\varphi,$$

so that the equation of this locus has the following form:

$$Cx^2 + 2Bxy + Ay^2 = 1.$$

A second-order curve is obtained that is evidently finite and closed;
i.e., it is an ellipse (figure 20), since all other second-order curves, as we
will later show, are either infinite or reduce to one point.
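The derivation can be replayed for a plate approximated by a few point masses. The sketch below is a hedged illustration: the function names and the sample masses are hypothetical, and the plate is assumed to be replaced by a finite list of triples $(x_i, y_i, m_i)$.

```python
import math

def inertia_coefficients(masses):
    """A, B, C as defined in the text, for an iterable of (x, y, m) triples."""
    A = sum(m * x * x for x, y, m in masses)
    B = -sum(m * x * y for x, y, m in masses)
    C = sum(m * y * y for x, y, m in masses)
    return A, B, C

def moment_direct(masses, phi):
    """Sum of r_i^2 m_i, with r_i the distance to the axis through O at angle phi."""
    return sum(m * (x * math.sin(phi) - y * math.cos(phi)) ** 2 for x, y, m in masses)

def moment_formula(A, B, C, phi):
    """J_u = A sin^2(phi) + 2B sin(phi) cos(phi) + C cos^2(phi)."""
    s, c = math.sin(phi), math.cos(phi)
    return A * s * s + 2 * B * s * c + C * c * c

def point_on_inertia_curve(A, B, C, phi):
    """End point of the segment of length rho = 1/sqrt(J_u) laid off along the axis u."""
    rho = 1 / math.sqrt(moment_formula(A, B, C, phi))
    return rho * math.cos(phi), rho * math.sin(phi)

plate = [(1, 0, 2), (0, 2, 1), (1, 1, 3), (-2, 1, 1)]  # hypothetical sample masses
A, B, C = inertia_coefficients(plate)
```

For any angle, `moment_direct` and `moment_formula` agree, and the point returned by `point_on_inertia_curve` satisfies $Cx^2 + 2Bxy + Ay^2 = 1$.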

The following remarkable result is obtained: Whatever may be the form
and size of a plate and the distribution of its mass, the magnitude of its
moment of inertia (more precisely, of the quantity $\rho$ inversely
proportional to the square root of the moment of inertia) with respect to
the various axes lying in the plane of the plate and passing through the
given point $O$, is characterized by a certain ellipse. This ellipse is called
the ellipse of inertia of the plate relative to the point $O$. If the
point $O$ is the center of gravity of the plate, then the ellipse is called
its central ellipse of inertia.

Fig. 20.

The ellipse of inertia plays a great role in mechanics; in particular, it
has an important application in the strength of materials. In the theory
of strength of materials, it is proved that the resistance to bending of a
beam with given cross section is proportional to the moment of inertia
of its cross section relative to the axis through the center of gravity of
the cross section and perpendicular to the direction of the bending force.
Let us clarify this by an example. We assume that a bridge across a
stream consists of a board that sags under the weight of a pedestrian
passing over it. If the same board (no thicker than before) is placed
“on its edge,” it scarcely bends at all, i.e., a board placed on its edge is,
so to speak, stronger. This follows from the fact that the moment of
inertia of the cross section of the board (it has the shape of an elongated
rectangle that we may think of as evenly covered with mass) is greater
relative to the axis perpendicular to its long side than relative to the axis
parallel to its long side. If we set the board not exactly flat nor on edge
but obliquely, or even if we do not take a board at all but a rod of
arbitrary cross section, for example a rail, the resistance to bending will
still be proportional to the moment of inertia of its cross section relative
to the corresponding axis. The resistance of a beam to bending is therefore
characterized by the ellipse of inertia of its cross section.

For an ordinary rectangular beam this ellipse will have the form shown
in figure 21. The rigidity of such a beam under a load in the direction
of the $Oz$-axis is proportional to $bh^3$.

Steel beams often have an I-shaped cross section; for such beams the
cross section and the ellipse of inertia are represented in figure 22. The
greatest resistance to bending is in the $z$ direction. When they are used,
for example as roof rafters under a load of snow and their own weights,
they work directly against bending in a direction close to this most
advantageous direction.

The hyperbola and its focal property. Now we consider the equation

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

representing a curve which is called a hyperbola. If we denote by $c$ a number
such that $c^2 = a^2 + b^2$, then it is possible to show that a hyperbola is
the locus of all points the difference of whose distances to the points $F_1$
and $F_2$ on the $Ox$-axis with abscissas $c$ and $-c$ is a constant: $\rho_2 - \rho_1 = 2a$
(figure 23). The points $F_1$ and $F_2$ are called the foci.



The parabola and its directrix. Finally, we consider the equation

$$y^2 = 2px$$

and call the corresponding curve a parabola. The point $F$ lying on the
$Ox$-axis with abscissa $p/2$ is called the focus of the parabola, and the
straight line $x = -p/2$, parallel to the $Oy$-axis, is its directrix. Let $M$ be
any point of the parabola (figure 24), $\rho$ the length of its focal radius $MF$,
and $d$ the length of the perpendicular dropped from it to the directrix.
Let us compute $\rho$ and $d$ for the point $M$. From the shaded triangle we
obtain $\rho^2 = (x - p/2)^2 + y^2$. As long as the point $M$ lies on the parabola,
we have $y^2 = 2px$, hence

$$\rho^2 = \left(x - \frac{p}{2}\right)^2 + 2px = \left(x + \frac{p}{2}\right)^2.$$

But directly from the figure it is clear that $d = x + p/2$. Therefore
$\rho^2 = d^2$, i.e., $\rho = d$. The inverse argument shows that if for a given
point we have $\rho = d$, then the point lies on the parabola. Thus a parabola
is the locus of points equidistant from a given point $F$ (called the
focus) and a given straight line $d$ (called the directrix).
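A short numerical check of the equality $\rho = d$ is given below. It is a sketch under stated assumptions: the function name is hypothetical, and the points of the parabola are generated from their ordinate $y$ through $x = y^2/(2p)$.

```python
import math

def focal_radius_and_directrix_distance(p, y):
    """rho = |MF| and d = distance to the directrix for the point M = (y^2/(2p), y)."""
    x = y * y / (2 * p)
    rho = math.hypot(x - p / 2, y)   # distance to the focus F = (p/2, 0)
    d = x + p / 2                    # distance to the line x = -p/2
    return rho, d

pairs = [focal_radius_and_directrix_distance(p, y)
         for p in (0.5, 2.0, 3.0) for y in (-4.0, -1.0, 0.0, 2.5, 7.0)]
```

Every pair in `pairs` has equal components up to rounding, for each choice of the parameter $p$.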

The property of the tangent to a parabola. Let us examine an important
property of the tangent to a parabola and its application in optics.

Since for a parabola $y^2 = 2px$ we have $2y\,dy = 2p\,dx$, it follows that the
derivative, or the slope of the tangent, is equal to $dy/dx = \tan\varphi = p/y$
(figure 25).

On the other hand, it follows directly from the figure that

$$\tan\gamma = \frac{y}{x - p/2}.$$

But

$$\tan 2\varphi = \frac{2\tan\varphi}{1 - \tan^2\varphi} = \frac{2p/y}{1 - p^2/y^2} = \frac{2py}{y^2 - p^2} = \frac{2py}{2px - p^2} = \frac{y}{x - p/2},$$

i.e., $\gamma = 2\varphi$, and since $\gamma = \varphi + \psi$, therefore $\psi = \varphi$. Consequently, by
virtue of the law of reflection (the angle of incidence is equal to the angle
of reflection) a beam of light, starting from the focus $F$ and reflected by
an element of the parabola (whose direction coincides with the direction of
the tangent) is reflected parallel to the $Ox$-axis, i.e., parallel to the axis
of symmetry of the parabola.
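The reflection property can also be verified with vectors instead of angles. The following Python sketch is an illustration, not the argument of the text: the function name is hypothetical, and the reflected ray is assumed to be the mirror image of the incoming direction in the tangent line, whose direction $(y, p)$ follows from $dy/dx = p/y$.

```python
import math

def reflected_direction(p, y):
    """Direction of a ray from the focus after reflection at the point (y^2/(2p), y)."""
    x = y * y / (2 * p)
    dx, dy = x - p / 2, y            # incoming ray: from the focus F to M
    tx, ty = y, p                    # tangent direction, since dy/dx = p/y
    k = 2 * (dx * tx + dy * ty) / (tx * tx + ty * ty)
    return k * tx - dx, k * ty - dy  # mirror image of the ray in the tangent line

directions = [reflected_direction(2.0, y) for y in (-5.0, -1.0, 0.0, 0.5, 3.0)]
```

Each reflected direction has a vanishing second component and a positive first component, i.e., the ray leaves parallel to the $Ox$-axis.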

On this property of the parabola is based the construction of reflecting
telescopes, as invented by Newton. If we manufacture a concave mirror
whose surface is a so-called paraboloid of revolution, i.e., a surface
obtained by the rotation of a parabola around its axis of symmetry, then
all the light rays originating from any point of a heavenly body lying
strictly in the direction of the “axis” of the mirror are collected by the
mirror (figure 26) at one point, namely its focus. The rays originating
from some other point of the heavenly body, being not exactly parallel
to the axis of the mirror, are collected almost at one point in the
neighborhood of the focus. Thus, in the so-called focal plane through the
focus of the mirror and perpendicular to its axis, the inverted image of
the star is obtained; the farther away this image is from the focus, the
more diffuse it will be, since it is only the rays exactly parallel to the
axis of the mirror that are collected by the mirror at one point. The
image so obtained can be viewed in a special microscope, the so-called
eyepiece of the telescope, either directly or, in order not to cut off the
light from the star with one’s own head, after reflection in a small plane
mirror, attached to the telescope near the focus (somewhat nearer than
the focus to the concave mirror) at an angle of 45°.

Fig. 27.

The searchlight (figure 27) is based on the same property of the parabola. 
In it, conversely, a strong source of light is placed at the focus of a 
paraboloidal mirror, so that its rays are reflected from the mirror in a 
beam parallel to its axis. Automobile headlights are similarly constructed 
(figure 28). 




Fig. 28.


Fig. 29.

In the case of an ellipse, as is easy to show, the rays issuing from
one of its foci and reflected by the ellipse are collected at the other
focus $F_2$ (figure 29), and in the hyperbola the rays originating from one
of its foci $F_1$ are reflected by it as if they originated from the other focus
$F_2$ (figure 30).



Fig. 30. 


The directrices of the ellipse and the hyperbola. Like the parabola,
the ellipse and the hyperbola have directrices, in this case two apiece.
If we consider a focus and the directrix “on the same side with it,” then
for all points $M$ of the ellipse we have $\rho/d = \varepsilon$, where the constant $\varepsilon$ is
the eccentricity, which for an ellipse is always smaller than 1; and for
all points of the corresponding branch of the hyperbola, we also have
$\rho/d = \varepsilon$, where $\varepsilon$ is again the eccentricity, which for a hyperbola is always
greater than 1.

Thus the ellipse, the parabola and one branch of the hyperbola are
the loci of all those points in the plane for which the ratio of their distance
$\rho$ from the focus to their distance $d$ from the directrix is constant (figures 31
and 32). For the ellipse this constant is smaller than unity, for the parabola
it is equal to unity, and for the hyperbola it is greater than unity. In
this sense the parabola is the “limiting” or “transition” case from the
ellipse to the hyperbola.

Fig. 31.

Fig. 32.

Conic sections. The ancient Greeks had already investigated in detail
the curves obtained by intersecting a straight circular cone by a plane.
If the intersecting plane makes with the axis of the cone an angle $\varphi$ of
90°, i.e., is perpendicular to it, then the section obtained is a circle. It
is easy to show that if the angle $\varphi$ is smaller than 90°, but greater than
the angle $\alpha$ which the generators of the cone make with its axis, then an
ellipse is obtained. If $\varphi$ is equal to $\alpha$, a parabola results and if $\varphi$ is
smaller than $\alpha$, then we obtain a hyperbola as the section (figure 33).

The parabola as the graph of quadratic proportion and the hyperbola
as the graph of inverse proportion. We recall that the graph of quadratic
proportion

$$y = kx^2$$

is a parabola (figure 34) and that the graph of inverse proportion

$$y = \frac{k}{x} \quad \text{or} \quad xy = k$$

is a hyperbola (figure 35), as we will easily prove later. A hyperbola was
defined earlier as the curve represented by the equation

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

In the special case $a = b$ the so-called rectangular hyperbola plays the
same role among hyperbolas as the circle plays among ellipses. In this
case, if we rotate the coordinate axes by 45° (figure 36) the equation in
the new coordinates $(x', y')$ will have the form

$$x'y' = k.$$

Fig. 34.

We have now considered three important second-order curves: the
ellipse, the hyperbola, and the parabola, and for their definitions we have
taken the so-called canonical equations

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \quad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \quad \text{and} \quad y^2 = 2px.$$

We now pass to the study of the general second-degree equation in two
unknowns, namely to the question what kinds of curves are represented
by this equation.


§8. The Reduction of the General Second-Degree Equation to
Canonical Form

The first consistent presentation of analytic geometry by Euler. A
significant step in the development of analytic geometry was the appearance
in 1748 of the book “Introduction to analysis” in the second volume
of which, among other things related to the theory of functions and
other branches of analysis, for the first time a presentation was given
of analytic geometry in the plane with a detailed investigation of
second-order curves, very close to the one given in contemporary textbooks
of analytic geometry, and also with an investigation of higher order curves.
This was the first exposition of analytic geometry in the contemporary
sense of the word.

The notion of reducing an equation to canonical form. A second-degree
equation*

$$Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F = 0$$

contains six terms, not three or only two as in the above canonical
equations of the ellipse, hyperbola, and parabola. This is not because
such an equation represents a more complicated curve but because the
system of coordinates is possibly not suited to it. It turns out that if we
select a suitable Cartesian coordinate system, then a second-degree
equation with two variables always can be reduced to one of the following

canonical forms:

1. $\frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 = 0$. Ellipse

2. $\frac{x^2}{a^2} + \frac{y^2}{b^2} + 1 = 0$. Imaginary ellipse

3. $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 0$. Point (a pair of imaginary lines
intersecting in a real point)

4. $\frac{x^2}{a^2} - \frac{y^2}{b^2} - 1 = 0$. Hyperbola

5. $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 0$. A pair of intersecting lines

6. $y^2 - 2px = 0$. Parabola

7. $x^2 - a^2 = 0$. A pair of parallel lines

8. $x^2 + a^2 = 0$. A pair of imaginary parallel lines

9. $x^2 = 0$. A pair of coincident straight lines

where $a$, $b$, $p$ are not equal to zero.

Equations 1, 4, and 6 of the enumerated canonical forms are already
well known to us; these are the canonical equations of the ellipse,
hyperbola, and parabola. Two of them are not satisfied by any points, namely
equations 2 and 8. Indeed, the square of a real number is always positive
or zero, so that on the left-hand side of equation 2 the sum of the terms
$x^2/a^2 + y^2/b^2$ is never negative, and since the term $+1$ also appears, the
left-hand side cannot be equal to zero; analogously in equation 8, the
number $x^2$ is not negative, and $a^2$ is positive. From these considerations,
it follows that only $(x = 0, y = 0)$ satisfies equation 3, i.e., one point,
the origin. Equation 5 can be written as $(x/a - y/b)(x/a + y/b) = 0$,
from which we see that it is satisfied by those points and only those
points for which one of the first-degree expressions $x/a - y/b$ or $x/a + y/b$
is equal to zero; so the curve it represents is this pair of intersecting lines.
Equation 7 analogously gives $(x - a)(x + a) = 0$; i.e., the corresponding
curve is a pair of parallel lines $x = a$ and $x = -a$. Finally, curve 9 is
a special limiting case of curve 7, when $a = 0$; i.e., it is a pair of coincident
lines.

* The coefficients of $xy$, $x$, $y$ will be denoted not by $B$, $D$, $E$ but by $2B$, $2D$, $2E$ for
simplicity of subsequent formulas.


Formulas of coordinate transformations. In order to obtain the
indicated important result about the possible types of second-order curves,
it is necessary first to deduce the formulas by which the rectangular
coordinates of points vary under a change of the coordinate system.

Let $x$, $y$ be the coordinates of a point $M$ relative to the axes $Oxy$.
Let us translate these axes parallel to themselves to the position $O'x'y'$
and let the coordinates of the new origin $O'$ relative to the old axes
be $\xi$ and $\eta$. It is evident (figure 37) that the new coordinates $x'$, $y'$ of
the point $M$ are connected with its old coordinates $x$, $y$ by the formulas

$$x = x' + \xi, \quad y = y' + \eta,$$

which are the formulas of the so-called parallel translation of axes.

Fig. 37

If we rotate the original axes $Oxy$ about the origin counterclockwise by an
angle $\varphi$ then, as is easy to see (figure 38), if we project the polygonal
line $OA'M$ composed of the new coordinate segments $x'$, $y'$ on the
$Ox$-axis and the $Oy$-axis, respectively, we obtain

$$x = x'\cos\varphi - y'\sin\varphi, \quad y = x'\sin\varphi + y'\cos\varphi,$$

which are the formulas for transformation of coordinates under rotation
of a rectangular coordinate system.

If we are given an equation $F(x, y) = 0$ of a curve relative to the
axes $Oxy$ and we wish to write the transformed equation of the same
curve, i.e., relative to the new axes $O'x'y'$, then we must replace $x$ and $y$
in the equation $F(x, y) = 0$ by their expressions in terms of $x'$ and $y'$,
given by the formulas of the transformation. For example, under parallel
translation of the axes, we obtain the transformed equation

$$F(x' + \xi, y' + \eta) = 0,$$

and under rotation of the axes the equation

$$F(x'\cos\varphi - y'\sin\varphi, x'\sin\varphi + y'\cos\varphi) = 0.$$

We note that under a transformation to new axes the degree of an
equation does not change. Indeed, the degree cannot increase, since the
transformation formulas are of the first degree. But the degree cannot
decrease either, since then the inverse coordinate transformation would
increase it (and it is also of the first degree).
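The substitution rule can be tried on the rectangular hyperbola $x^2 - y^2 = a^2$. The sketch below is an illustration with hypothetical function names; it assumes a counterclockwise rotation by 45°, for which the transformed equation works out to $x'y' = -a^2/2$ (a rotation in the opposite sense would give the constant $+a^2/2$).

```python
import math

def old_from_translated(xp, yp, xi, eta):
    """Old coordinates of a point given in axes translated to the origin (xi, eta)."""
    return xp + xi, yp + eta

def old_from_rotated(xp, yp, phi):
    """Old coordinates of a point given in axes rotated counterclockwise by phi."""
    return (xp * math.cos(phi) - yp * math.sin(phi),
            xp * math.sin(phi) + yp * math.cos(phi))

def rectangular_hyperbola(x, y, a):
    """Left-hand side F(x, y) of x^2 - y^2 - a^2 = 0."""
    return x * x - y * y - a * a

# Take a point with x'y' = -a^2/2 in the rotated axes and return to the old axes.
a = 2.0
xp = 0.8
yp = -a * a / (2 * xp)
x, y = old_from_rotated(xp, yp, math.pi / 4)
residual = rectangular_hyperbola(x, y, a)
```

The value `residual` vanishes up to rounding, so the point lies on the original curve, which is what replacing $x$, $y$ by their expressions in $x'$, $y'$ asserts.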

The reduction of a general second-degree equation to one of the 9 canonical
forms. We now show that given any second-degree equation in
two unknowns we can always first rotate the axes and then translate
them parallel to themselves in such a way that the transformed equation
for the final axes will have one of the forms 1, 2, ..., 9.

Indeed, let the given second-degree equation have the form

$$Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F = 0. \tag{8}$$

Let us rotate the axes through some angle $\varphi$, which we select in the
following way. Replacing x and y in equation (8) by their expressions
in terms of the new coordinates (according to the formulas for rotation),
we find, after collecting similar terms, that the coefficient $2B'$ in the
transformed equation

$$A'x'^2 + 2B'x'y' + C'y'^2 + 2D'x' + 2E'y' + F' = 0$$

is equal to

$$2B' = -2A\sin\varphi\cos\varphi + 2B(\cos^2\varphi - \sin^2\varphi) + 2C\sin\varphi\cos\varphi
     = 2B\cos 2\varphi - (A - C)\sin 2\varphi.$$

Setting it equal to zero, we obtain $2B\cos 2\varphi = (A - C)\sin 2\varphi$, from
which

$$\cot 2\varphi = \frac{A - C}{2B}.$$

Since the cotangent varies from $-\infty$ to $+\infty$, we can always find an
angle $\varphi$ for which this equality is satisfied (if B = 0, equation (8)
already has no product term and no rotation is needed). By rotating the
axes through this angle, we find that for the rotated axes $Ox'y'$ the
equation of our curve, represented for the initial axes by equation (8),
has the form

$$A'x'^2 + C'y'^2 + 2D'x' + 2E'y' + F = 0, \tag{9}$$

i.e., that it does not contain the term with the product of the coordinates
(F remains unchanged, since the formulas of rotation do not contain
constant terms).
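
The choice of the angle can be checked on a concrete equation. The
following is a minimal sketch, not part of the original text; the name
`remove_cross_term` is hypothetical, and `atan2` is used here instead of
the cotangent only so that the case B = 0 needs no special treatment. The
expressions for $A'$, $C'$, $D'$, $E'$ are obtained by the same substitution
that gives $2B'$ above.

```python
import math

def remove_cross_term(A, B, C, D, E):
    """Rotate the axes so that the product term of
    Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F = 0 disappears.

    Returns (phi, A1, B1, C1, D1, E1); F is not changed by a rotation.
    """
    # cot(2*phi) = (A - C) / (2B), i.e. tan(2*phi) = 2B / (A - C).
    phi = 0.5 * math.atan2(2 * B, A - C)
    c, s = math.cos(phi), math.sin(phi)
    A1 = A * c * c + 2 * B * c * s + C * s * s
    C1 = A * s * s - 2 * B * c * s + C * c * c
    B1 = B * (c * c - s * s) - (A - C) * s * c   # should vanish
    D1 = D * c + E * s
    E1 = -D * s + E * c
    return phi, A1, B1, C1, D1, E1

# Example: 5x^2 + 4xy + 2y^2 + ... (A = 5, B = 2, C = 2).
phi, A1, B1, C1, D1, E1 = remove_cross_term(5, 2, 2, 0, 0)
print(round(A1, 9), round(B1, 9), round(C1, 9))
```

For this example the rotated equation has $A' = 6$, $C' = 1$ and no product
term, so the quadratic part becomes $6x'^2 + y'^2$.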

Now we translate the already rotated axes $Ox'y'$ parallel to themselves
to the position $O''x''y''$, and let the coordinates of the new origin $O''$
relative to the axes $Ox'y'$ be $\xi'$, $\eta'$. The equation of our curve for these
final axes will be

$$A'(x'' + \xi')^2 + C'(y'' + \eta')^2 + 2D'(x'' + \xi') + 2E'(y'' + \eta') + F = 0. \tag{10}$$

We now show that we can always select $\xi'$ and $\eta'$ (i.e., we can translate
the axes $Ox'y'$ parallel to themselves) in such a way that the final equation
for the axes $O''x''y''$ has one of the canonical forms 1, 2, ..., 9.

Removing all parentheses in equation (10) and collecting similar
terms, we obtain

$$A'x''^2 + C'y''^2 + 2(A'\xi' + D')x'' + 2(C'\eta' + E')y'' + F' = 0, \tag{10'}$$

where we have denoted by $F'$ the sum of all constant terms; its value
does not interest us at the moment.

We consider three possible cases.

I. $A'$ and $C'$ both not equal to zero. In this case, taking $\xi' = -D'/A'$,
$\eta' = -E'/C'$, we annihilate the terms with the first powers of $x''$ and $y''$
and obtain an equation of the form

$$A'x''^2 + C'y''^2 + F' = 0. \tag{I}$$

II. $A' \ne 0$, $C' = 0$, but $E' \ne 0$. Letting $\xi' = -D'/A'$, $\eta' = 0$, i.e.,
$y'' = y'$, we obtain the equation

$$A'x''^2 + 2E'y' + F' = 0,$$

or

$$A'x''^2 + 2E'\left(y' + \frac{F'}{2E'}\right) = 0.$$

Then making a parallel translation along the $Oy'$-axis by an amount
$\eta'' = -F'/2E'$, we find that $y' = y'' - F'/2E'$, i.e., $y' + F'/2E' = y''$, so
that we obtain the equation

$$A'x''^2 + 2E'y'' = 0. \tag{II}$$

If we have $A' = 0$, $C' \ne 0$, $D' \ne 0$, we can simply interchange the
roles of x and y and obtain the same result.

III. $A' \ne 0$, $C' = 0$, $E' = 0$. Taking again $\xi' = -D'/A'$, $\eta' = 0$, we
obtain the equation

$$A'x''^2 + F' = 0. \tag{III}$$

If we have $A' = 0$, $C' \ne 0$, $D' = 0$, we can again interchange the roles
of x and y.

We have now considered all the possibilities, in view of the fact that
$A'$ and $C'$ cannot simultaneously be zero, since then the degree of the
equation would be reduced, and we have seen that under our coordinate
transformations this degree does not change.
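
The three cases can be collected into one small procedure. This is a
minimal sketch, not part of the original text: the function name, the
tolerance `eps`, and the form of the returned tuples are assumptions made
for illustration. The constant term in cases I and III is the value of $F'$
obtained by expanding (10) with the stated $\xi'$, $\eta'$.

```python
def reduce_by_translation(A1, C1, D1, E1, F, eps=1e-12):
    """Reduce A'x'^2 + C'y'^2 + 2D'x' + 2E'y' + F = 0 by parallel translation.

    Returns ("I", (A', C', F')), ("II", (A', E')) or ("III", (A', F')),
    the coefficients of the reduced equations (I), (II), (III).
    """
    if abs(A1) < eps and abs(C1) < eps:
        raise ValueError("A' and C' cannot both be zero")
    if abs(A1) < eps:
        # Interchange the roles of x and y so that A' != 0.
        A1, C1, D1, E1 = C1, A1, E1, D1
    if abs(C1) >= eps:
        # Case I: xi' = -D'/A', eta' = -E'/C' removes both linear terms.
        return "I", (A1, C1, F - D1 * D1 / A1 - E1 * E1 / C1)
    if abs(E1) >= eps:
        # Case II: xi' = -D'/A'; a further shift along Oy' absorbs the constant.
        return "II", (A1, E1)
    # Case III: xi' = -D'/A' leaves only the square and a constant.
    return "III", (A1, F - D1 * D1 / A1)

# x^2 + 4y^2 - 2x + 8y + 1 = 0 becomes x''^2 + 4y''^2 - 4 = 0.
print(reduce_by_translation(1, 4, -1, 4, 1))
```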

Thus, with the appropriate choice of rectangular coordinates every
second-degree equation can be brought to one of the three so-called
“reduced” equations (I), (II), (III).

Let the equation have the form (I) (in this case $A'$ and $C'$ are not
equal to zero). If $F' \ne 0$, then writing equation (I) as

$$\frac{x''^2}{-F'/A'} + \frac{y''^2}{-F'/C'} - 1 = 0,$$

we arrive, depending on the signs of $A'$, $C'$, $F'$, at one of the equations
1, 2, or 4. If the denominator of $x''^2$ is negative and that of $y''^2$ is positive,
then we must also interchange the axes $O''x''$ and $O''y''$.

If $F' = 0$, then equation (I) can be written in the form

$$\frac{x''^2}{1/A'} + \frac{y''^2}{1/C'} = 0,$$

and we arrive at equations 3 or 5.

If the equation has the form (II) (in this case $A'$ and $E'$ are both
nonzero), then we can write it as

$$x''^2 = 2\left(-\frac{E'}{A'}\right)y'',$$

and denoting $-E'/A'$ by p and interchanging the names of the axes
$O''x''$ and $O''y''$ we obtain equation 6.

Finally, if we have an equation of form (III) (where $A' \ne 0$), it can be
rewritten as $x''^2 + F'/A' = 0$ and one of the equations 7, 8, or 9 is
obtained.

This important theorem on the possibility of reducing every second-degree
equation to one of the 9 canonical forms was already examined
in detail by Euler. The arguments in Euler’s book differ only in form
from the ones just given.

§9. The Representation of Forces, Velocities, and Accelerations by
Triples of Numbers; Theory of Vectors

Following Euler an important step was taken by Lagrange. In his
“Analytic Mechanics,” published in 1788, Lagrange arithmetized forces,
velocities and accelerations in the same way as Descartes arithmetized
points. This idea that Lagrange developed in his book subsequently
took the form of the so-called theory of vectors and proved to be an
important help in physics, mechanics, and technology.

Rectangular coordinates in space. We remark, first of all, that
neither Descartes nor Newton developed analytic geometry in space.
This was done later on, in the first half of the 18th century, by Laguerre
and Clairaut. In order to specify a point M in space they selected three
mutually perpendicular axes Ox, Oy, and Oz and considered (figure 39)
the numerical values of the distances of the point M from the planes
Oyz, Oxz, and Oxy, taken with the corresponding signs, the so-called
abscissa x, ordinate y, and altitude z of the point M.



Fig. 39. 


Fig. 40.

Arithmetization of forces, velocities, and accelerations, introduced by
Lagrange. We consider (figure 40) a force f which can be represented
in conventional units by a segment with an arrow, having a specific
length and direction. Lagrange points out that this force f can be
decomposed into three components $f_x$, $f_y$, and $f_z$ in the direction of the
corresponding axes Ox, Oy, and Oz; these components, as directed
segments on the axes, can be given simply by numbers, positive or negative
depending on whether the component is directed in the positive or the
opposite direction of the axis. Thus, we can consider, for example, the
force (2, 3, 4) or the force (1, −2, 5), etc. In the composition of forces
according to the parallelogram law, as can easily be shown (it will be
shown later), their corresponding components have to be added. For
example, the sum of the given forces is the force

$$(2 + 1,\; 3 - 2,\; 4 + 5) = (3, 1, 9).$$

The same can be done for velocities and accelerations. In every problem
of mechanics, all the equations connecting forces, velocities, and
accelerations can also be written as equations connecting their components, i.e.,
connecting simply numbers; then the mechanical equation will necessarily
be written in the form of three equations: the first for the x's, the second
for the y's, and the third for the z's.
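
In code, the composition of forces by components is a single line. The
following minimal sketch is not part of the original text; the name
`add_forces` is chosen only for illustration.

```python
def add_forces(f, g):
    """Compose two forces given by their components along Ox, Oy, Oz."""
    return tuple(fi + gi for fi, gi in zip(f, g))

# The two forces used as an example in the text.
print(add_forces((2, 3, 4), (1, -2, 5)))  # (3, 1, 9)
```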

But it was only after a hundred years from the time of Lagrange that 
mathematicians and physicists, particularly under the influence of the 
developing theory of electricity, began on a wide scale to consider the 
general theory of such segments, having a definite length and direction. 
Such segments were called vectors. 

The theory of vectors has a great significance in mechanics, physics, 
and technology, and its algebraic side, the so-called algebra of vectors 
(in contrast to vector analysis) appears at once as an essential constituent 
part of analytic geometry. 

Algebra of vectors. Any directed segment (whether it represents a
force, a velocity, an acceleration, or some other entity), i.e., a segment
having a given length and a definite direction, is called a vector. Two
vectors are said to be equal if they have the same length and the same
direction; i.e., in the very concept of “vector” only its length and its
direction are taken into account. Vectors can be added. Let the vectors
a, b, ..., d be given. We lay out the vector a from some point, then from
its end point we draw the vector b, etc. We obtain a so-called vector
polygon ab...d (figure 41). The vector m whose initial point coincides
with the initial point of the first vector a of this polygon, and whose
end point coincides with the end point of the last vector d, is called the
sum of these vectors

$$\mathbf{m} = \mathbf{a} + \mathbf{b} + \cdots + \mathbf{d}. \tag{11}$$

It is easy to show that the vector m does not depend on the order in
which the summands a, b, ..., d are taken.


Fig. 41.



Fig. 42. 


The vector equal in length to the vector a but opposite in direction
is called its inverse vector and is denoted by −a.

Subtraction of the vector a is defined as addition of its inverse vector.

In vector calculus ordinary real numbers are customarily called scalars.
Let a vector a (figure 42) and a scalar $\lambda$ be given; then by the product
of the vector a with the scalar (number) $\lambda$, i.e., $\lambda\mathbf{a}$, is meant the vector
whose length is equal to the product of the length $|\mathbf{a}|$ of the vector a
and the absolute value $|\lambda|$ of the number $\lambda$, and whose direction is the
same as that of a if $\lambda > 0$ and the opposite if $\lambda < 0$.

Fig. 43.

Fig. 44.

Let us consider a system of rectangular Cartesian coordinates Oxyz
and the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ having length equal to unity and directions
coinciding with the positive directions of the axes Ox, Oy, Oz, respectively.
It is obvious that any given point M (figure 43) of space can be reached
from the origin O by traversing a certain number of “times” (an integral,
fractional or irrational, positive or negative “number of times”) the
vector $\mathbf{e}_1$, then so many “times” the vector $\mathbf{e}_2$, and finally so many
“times” the vector $\mathbf{e}_3$. It is clear that the numbers x, y, z showing how
many “times” it is necessary to traverse the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are simply
the Cartesian coordinates of the point M.

Let a vector a be given; if we cause a point to move from the initial
point of a to its end point and decompose this motion into motions
parallel to the axes Ox, Oy, and Oz, and if it is hereby necessary to shift
the point through a distance $x\mathbf{e}_1$ parallel to the Ox-axis, through $y\mathbf{e}_2$
parallel to the Oy-axis and through $z\mathbf{e}_3$ parallel to the Oz-axis, then

$$\mathbf{a} = x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3. \tag{12}$$

The numbers x, y, z are called the coordinates of the vector a. These are
obviously just the coordinates of the end point M of this vector, if its
initial point lies at the origin O of the coordinate system (figure 44).
From this it follows at once that in adding vectors their corresponding
coordinates are to be added, and in subtraction they are to be subtracted.
If the first vector “carries” us along the Ox-axis by a distance $x\mathbf{e}_1$, and
the second by $x'\mathbf{e}_1$, then clearly their sum “carries” us along the Ox-axis by
a distance $(x + x')\mathbf{e}_1$, etc. (figure 45). It also follows at once that in
multiplication of a vector by a number, its coordinates are multiplied by
this number.

Fig. 45.


Scalar product and its properties. If we are given two vectors a and
b, then the number equal to the product of their lengths by the cosine
of the angle between them, $|\mathbf{a}|\,|\mathbf{b}|\cos\varphi$, is called their scalar product and
is denoted by ab or (ab). Let x, y, z be the coordinates of the vector a
and $x'$, $y'$, $z'$ the coordinates of the vector b; then the scalar product is
equal to

$$\mathbf{a}\mathbf{b} = xx' + yy' + zz', \tag{13}$$

i.e., to the sum of the products of their corresponding coordinates.

This important result can be proved as follows. First we make the
following remarks:

(1) If we multiply one of the vectors of a scalar product, for example a,
by a number $\lambda$, this is obviously the same as multiplying their scalar
product by the same number, i.e.,

$$(\lambda\mathbf{a})\mathbf{b} = \lambda(\mathbf{a}\mathbf{b}).$$

(2) The scalar product is distributive, i.e., if
$\mathbf{a} = \mathbf{a}_1 + \mathbf{a}_2$, then $\mathbf{a}\mathbf{b} = \mathbf{a}_1\mathbf{b} + \mathbf{a}_2\mathbf{b}$.

In fact, the left-hand side of this equality is
equal to the product of the length of the vector b
by the numerical value of the projection of the
vector a on the axis of the vector b (figure 46),
and the right-hand side is equal to the product of
the length of b by the sum of the numerical values
of the projections of the vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ on the axis of b. But
$\operatorname{proj}\mathbf{a} = \operatorname{proj}\mathbf{a}_1 + \operatorname{proj}\mathbf{a}_2$, which proves the equality.

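Now we consider two vectors a and b whose decompositions in terms of
the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are $\mathbf{a} = x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3$, $\mathbf{b} = x'\mathbf{e}_1 + y'\mathbf{e}_2 + z'\mathbf{e}_3$,
so that

$$\mathbf{a}\mathbf{b} = (x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3)(x'\mathbf{e}_1 + y'\mathbf{e}_2 + z'\mathbf{e}_3).$$

By the distributivity (2) of the scalar product, the sums of vectors in
parentheses can be multiplied as polynomials, and by (1) the scalar
factors in each of the terms can be taken outside the parentheses, so that

$$\mathbf{a}\mathbf{b} = xx'\,\mathbf{e}_1\mathbf{e}_1 + xy'\,\mathbf{e}_1\mathbf{e}_2 + xz'\,\mathbf{e}_1\mathbf{e}_3
 + yx'\,\mathbf{e}_2\mathbf{e}_1 + yy'\,\mathbf{e}_2\mathbf{e}_2 + yz'\,\mathbf{e}_2\mathbf{e}_3
 + zx'\,\mathbf{e}_3\mathbf{e}_1 + zy'\,\mathbf{e}_3\mathbf{e}_2 + zz'\,\mathbf{e}_3\mathbf{e}_3.$$

But

$$|\mathbf{e}_1| = |\mathbf{e}_2| = |\mathbf{e}_3| = 1, \quad \cos 0 = 1 \quad\text{and}\quad \cos 90^\circ = 0.$$

Consequently,

$$\begin{aligned}
\mathbf{e}_1\mathbf{e}_1 &= 1, & \mathbf{e}_1\mathbf{e}_2 &= 0, & \mathbf{e}_1\mathbf{e}_3 &= 0,\\
\mathbf{e}_2\mathbf{e}_1 &= 0, & \mathbf{e}_2\mathbf{e}_2 &= 1, & \mathbf{e}_2\mathbf{e}_3 &= 0,\\
\mathbf{e}_3\mathbf{e}_1 &= 0, & \mathbf{e}_3\mathbf{e}_2 &= 0, & \mathbf{e}_3\mathbf{e}_3 &= 1.
\end{aligned}$$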


Thus,

$$\mathbf{a}\mathbf{b} = xx' + yy' + zz'. \tag{14}$$

We remark, in particular, that if the vectors a and b are mutually
perpendicular, then $\varphi = 90^\circ$ and $\cos\varphi = 0$. Therefore the equality

$$xx' + yy' + zz' = 0 \tag{15}$$

serves as an easily verifiable condition of perpendicularity of the vectors
a and b.
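
The agreement between the coordinate formula and the geometric definition
can be checked on a simple pair of vectors. This is a minimal sketch, not
part of the original text; the helper names `dot`, `length`, and
`is_perpendicular`, and the tolerance `eps`, are assumptions made for
illustration.

```python
import math

def dot(a, b):
    """Scalar product as the sum of products of corresponding coordinates."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length of a vector; the scalar square of a is |a|^2, since cos 0 = 1."""
    return math.sqrt(dot(a, a))

def is_perpendicular(a, b, eps=1e-12):
    """Perpendicularity test: the scalar product vanishes."""
    return abs(dot(a, b)) < eps

# a lies along Ox, b along the bisector of the first quadrant of the
# Oxy-plane, so the angle between them is 45 degrees.
a, b = (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)
geometric = length(a) * length(b) * math.cos(math.radians(45))
print(dot(a, b), geometric)
print(is_perpendicular((1, 2, 3), (3, 0, -1)))
```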

Angle between two directions. Let us consider a direction characterized
by its angles $\alpha$, $\beta$, $\gamma$ with the coordinate axes. We draw the line
in this direction through the origin of the coordinate system and mark
off on it from the origin a segment OA of unit length (figure 47). In this
case the coordinates of the point A, i.e., the coordinates of the vector OA,
are exactly $\cos\alpha$, $\cos\beta$, and $\cos\gamma$. If we have a second direction given
by the angles $\alpha'$, $\beta'$, $\gamma'$, then the analogous vector OB for this second
direction has coordinates $\cos\alpha'$, $\cos\beta'$, $\cos\gamma'$ (figure 48). Let $\varphi$ be the angle
between these vectors; then their scalar product is equal to $1 \cdot 1 \cdot \cos\varphi$,
from which we find

$$\cos\varphi = \cos\alpha\cos\alpha' + \cos\beta\cos\beta' + \cos\gamma\cos\gamma'. \tag{16}$$

This is the very important formula for the cosine of the angle between
two directions.
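
The formula for the cosine of the angle can be applied to directions given
by arbitrary nonzero vectors, after passing to the unit vectors along them.
The following is a minimal sketch, not part of the original text; the names
`direction_cosines` and `angle_between` are hypothetical.

```python
import math

def direction_cosines(v):
    """Coordinates of the unit vector along v, i.e. the cosines of the
    angles that the direction of v makes with the coordinate axes."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def angle_between(u, v):
    """Angle (radians) between two directions, from their direction cosines."""
    cu, cv = direction_cosines(u), direction_cosines(v)
    cos_phi = sum(p * q for p, q in zip(cu, cv))
    # Clamp against rounding error before calling acos.
    return math.acos(max(-1.0, min(1.0, cos_phi)))

print(math.degrees(angle_between((1, 0, 0), (1, 1, 0))))
```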


§10. Analytic Geometry in Space; Equations of a Surface in Space and 
Equations of a Curve 

If an equation z = f(x, y) is given and if x and y are regarded as the
abscissa and ordinate and z the altitude of a point, then this equation
itself represents some surface P, which can be obtained by erecting
perpendiculars of length z at the points (x, y) of the Oxy-plane. The
locus of the end points of these perpendiculars gives the surface P
represented by this equation. If the equation connecting x, y, and z is not
already solved with respect to z, then it can be solved for z and after
that we can construct the surface P. In general, in analytic geometry the
totality of all those points of space whose coordinates x, y, z satisfy a
given equation (figure 49) in three variables x, y, z is said to be the surface
represented by the equation.



A function of two variables f(x, y), as was pointed out already in
Chapter II, can represent not only a surface P, but also its system of
level curves, i.e., curves in the Oxy-plane on each of which the function
f(x, y) has a constant value. This system of curves is clearly nothing
else than the topographical map of the surface P on the Oxy-plane.

Example. The equation xy = z gives, for instance, the level curves:
..., xy = −3, xy = −2, xy = −1, xy = 0, xy = 1, xy = 2, xy = 3, ....
All of them are hyperbolas (figure 50) except xy = 0, which represents
the two coordinate axes. What is obtained is clearly a saddlelike surface
(figure 51) (the so-called hyperbolic paraboloid).

In order to define a curve in space, we can give the equations of any
two surfaces P and Q which intersect along the curve. For example, the
system

$$xy = z, \qquad x^2 + y^2 = 1$$

gives a space curve (figure 52). The equation xy = z determines the
earlier hyperbolic paraboloid, and the equation $x^2 + y^2 = 1$ determines
a circular cylinder of unit radius, whose axis is the Oz-axis. The system
of equations consequently defines the curve of intersection of the
paraboloid with the cylinder, which is represented in figure 52.



Fig. 51. Fig. 52.


If in this system one of the unknowns, say x, is chosen arbitrarily,
and then the system is solved with respect to y and z, we will obtain
the coordinates x, y, z of the various points of the curve.
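
For the system $xy = z$, $x^2 + y^2 = 1$ this procedure is easy to carry out
explicitly. The following is a minimal sketch, not part of the original
text; the function name `curve_points` is chosen only for illustration.

```python
import math

def curve_points(x):
    """Points of the curve xy = z, x^2 + y^2 = 1 with the given abscissa x."""
    if abs(x) > 1:
        return []                       # the cylinder has no points with |x| > 1
    y = math.sqrt(1 - x * x)            # from the equation of the cylinder
    ys = [y] if y == 0 else [y, -y]     # two values of y, except when |x| = 1
    return [(x, yy, x * yy) for yy in ys]   # z from the equation xy = z

for point in curve_points(0.6):
    print(point)
```

For each admissible x other than ±1 there are two points of the curve, one
on each side of the Oxz-plane.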

Equation of a plane and equations of a straight line. It can be shown
that every equation of the first degree with three variables

$$Ax + By + Cz + D = 0$$

represents a plane, and conversely. By what has already been said, it is
clear that a line can be given by a system of two such equations:

$$A_1x + B_1y + C_1z + D_1 = 0,$$

$$A_2x + B_2y + C_2z + D_2 = 0,$$

i.e., as the curve of intersection of two planes.

The general second-degree equation in three variables and its 17 canonical
forms. A second-degree equation in three variables

$$A_1x^2 + A_2y^2 + A_3z^2 + 2B_1yz + 2B_2xz + 2B_3xy + 2C_1x + 2C_2y + 2C_3z + D = 0 \tag{17}$$

contains 10 terms. Analogously to what was done earlier for an equation
with two variables, it can be shown that by a suitable rotation of the
given coordinate system about the origin, equation (17) can be reduced
to the form

$$A_1'x'^2 + A_2'y'^2 + A_3'z'^2 + 2C_1'x' + 2C_2'y' + 2C_3'z' + D = 0, \tag{18}$$

i.e., so as to eliminate the terms with products of the variables. However,
the proof here of the possibility of such a simplification of the equation
is considerably more difficult than in the case of the plane. The difficulty
of the proof arises from the fact that in the plane a rotation about a
point is given by one angle $\varphi$, which we selected suitably, while in space
the rotation of a body about a fixed point is given by three independent
angles (Euler angles) $\varphi$, $\theta$, $\psi$ and in a quite complicated way. So the
equation must be cleared of the terms with products of variables in a
roundabout way (see Chapter XVI on the theory of reduction by orthogonal
transformations of a quadratic form to a sum of squares). Then, as
in the case of the plane, a parallel translation of the axes is made and
the equation is simplified, after which equation (18) finally assumes one
of the following canonical forms:

1. $x^2/a^2 + y^2/b^2 + z^2/c^2 - 1 = 0$: Ellipsoid
2. $x^2/a^2 + y^2/b^2 + z^2/c^2 + 1 = 0$: Imaginary ellipsoid
3. $x^2/a^2 + y^2/b^2 - z^2/c^2 - 1 = 0$: Hyperboloid of one sheet
4. $x^2/a^2 + y^2/b^2 - z^2/c^2 + 1 = 0$: Hyperboloid of two sheets
5. $x^2/a^2 + y^2/b^2 - z^2/c^2 = 0$: Second-order cone
6. $x^2/a^2 + y^2/b^2 + z^2/c^2 = 0$: Imaginary second-order cone
7. $x^2/a^2 + y^2/b^2 - 2cz = 0$: Elliptic paraboloid
8. $x^2/a^2 - y^2/b^2 - 2cz = 0$: Hyperbolic paraboloid
9. $x^2/a^2 + y^2/b^2 - 1 = 0$: Elliptic cylinder
10. $x^2/a^2 + y^2/b^2 + 1 = 0$: Imaginary elliptic cylinder
11. $x^2/a^2 + y^2/b^2 = 0$: A pair of intersecting imaginary planes
12. $x^2/a^2 - y^2/b^2 - 1 = 0$: Hyperbolic cylinder
13. $x^2/a^2 - y^2/b^2 = 0$: A pair of intersecting planes
14. $y^2 - 2px = 0$: Parabolic cylinder
15. $x^2 - a^2 = 0$: A pair of parallel planes
16. $x^2 + a^2 = 0$: A pair of imaginary parallel planes
17. $x^2 = 0$: A pair of coincident planes


The last nine canonical equations 9-17 do not contain terms in z and
represent exactly the canonical equations of second-order curves in the
Oxy-plane. In space these equations represent cylinders, whose directrices
are the corresponding second-order curves in the Oxy-plane and whose
generators are parallel to the Oz-axis. Indeed, if one of these equations
is satisfied by a point with coordinates $(x_1, y_1, 0)$, then it will also be
satisfied by any point with coordinates $(x_1, y_1, z)$ whatever z may be,
since there are in any case no terms with z in the equation.

Among the equations 1-8, as can easily be seen, equation 2 is not
satisfied by any point with real x, y, z and equation 6 is satisfied only
by one such point (0, 0, 0), i.e., the origin. It remains, therefore, to study
only the six equations 1, 3, 4, 5, 7, 8.

Ellipsoid. Let us compare the surfaces represented by the equations
$x^2/a^2 + y^2/b^2 + z^2/c^2 - 1 = 0$ and $x^2 + y^2 + z^2 - 1 = 0$. The second
of these is obviously the equation of a sphere C with center at the origin
and with unit radius, since $x^2 + y^2 + z^2$ is the square of the distance
from the point (x, y, z) to the origin O. If (x, y, z) is a point lying on
the sphere, i.e., satisfying the second equation, then (ax, by, cz) is a
point whose coordinates satisfy the first equation. The surface represented
by the first equation is thus obtained from the sphere C if all abscissas x
of points of the sphere are replaced by ax, y by by, and z by cz, i.e., if the
sphere C is uniformly stretched from the Oyz-, Oxz-, and Oxy-planes with
coefficients of expansion a, b and c, respectively. This surface is called
an ellipsoid (figure 53).

Fig. 53.
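
The stretching of the sphere can be verified numerically. This is a minimal
sketch, not part of the original text; the names `stretch` and
`on_ellipsoid`, the tolerance `eps`, and the sample values a = 3, b = 2,
c = 1 are assumptions made for illustration.

```python
import math

def stretch(point, a, b, c):
    """Stretch from the Oyz-, Oxz-, and Oxy-planes with coefficients a, b, c."""
    x, y, z = point
    return a * x, b * y, c * z

def on_ellipsoid(point, a, b, c, eps=1e-12):
    """Check the equation x^2/a^2 + y^2/b^2 + z^2/c^2 - 1 = 0."""
    x, y, z = point
    return abs(x * x / a**2 + y * y / b**2 + z * z / c**2 - 1) < eps

# A point of the unit sphere, given by two angles.
theta, phi = 0.7, 1.9
p = (math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta))
print(on_ellipsoid(stretch(p, 3, 2, 1), 3, 2, 1))
```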



Hyperboloids and the second-order cone. Let us consider equations
3, 4, and 5, i.e., the equation of the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = \delta, \tag{19}$$

where $\delta = 1$, $-1$ or 0. Let us compare it with the equation

$$\frac{x^2}{a^2} + \frac{y^2}{a^2} - \frac{z^2}{c^2} = \delta, \tag{20}$$

in which the denominator of $y^2$ is also $a^2$ and not $b^2$, as in equation (19).
As before, we observe that surface (19) is obtained from surface (20) by
expansion from the Oxz-plane with coefficient b/a.

Let us now see what surface is represented by (20). We take a plane
z = h perpendicular to the Oz-axis and examine its intersection with the
surface (20). Substituting z = h in equation (20), we obtain

$$x^2 + y^2 = a^2\left(\delta + \frac{h^2}{c^2}\right).$$

If $\delta + h^2/c^2$ is positive, then this equation together with z = h gives
a circle, lying in the plane z = h with center on the Oz-axis. If $\delta + h^2/c^2$
is negative, which can be the case only with $\delta = -1$ and $h^2$ small, then
the plane z = h does not intersect surface (20) at all, since the sum of
squares $x^2 + y^2$ cannot be a negative number.
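
The radius of the circular section follows directly from the last equation.
The following is a minimal sketch, not part of the original text; the name
`section_radius` and the convention of returning `None` when the plane
misses the surface are assumptions made for illustration.

```python
import math

def section_radius(a, c, delta, h):
    """Radius of the circle cut from x^2/a^2 + y^2/a^2 - z^2/c^2 = delta
    by the plane z = h, or None when the plane misses the surface."""
    r2 = a * a * (delta + h * h / (c * c))
    return math.sqrt(r2) if r2 >= 0 else None

print(section_radius(2, 1, 1, 0))      # narrowest circle for delta = 1
print(section_radius(2, 1, -1, 0.5))   # no intersection for delta = -1, h small
print(section_radius(2, 1, 0, 3))      # for delta = 0 the radius grows linearly in h
```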




The whole surface (20) thus consists of circles lying in planes
perpendicular to the Oz-axis and having their centers on the Oz-axis. But in
this case the surface (20) is a surface of revolution about the Oz-axis.



Fig. 54. Fig. 55.


If we intersect it with a plane passing through the Oz-axis, we obtain
its “meridian,” i.e., a curve, lying in a plane passing through the axis,
by the revolution of which the surface is generated.

If we intersect the surface (20) with the coordinate plane Oxz, i.e.,
the plane y = 0 (figure 54), by substituting y = 0 in equation (20), we
obtain the equation of the meridian $x^2/a^2 - z^2/c^2 = \delta$. In case $\delta = 1$
this is the hyperbola I, for $\delta = -1$ it is the hyperbola II, and for $\delta = 0$,
the pair of intersecting lines III. By revolution around the z-axis these
produce, respectively, a so-called hyperboloid of revolution of one sheet
(figure 55), a hyperboloid of revolution of two sheets (figure 56) and a
straight circular cone (figure 57).

Fig. 56.

Fig. 57.

The general hyperboloid of one sheet, hyperboloid of two sheets, and
second-order cone 3, 4, and 5 are obtained from these surfaces of revolution
by an expansion from the Oxz-plane with coefficient b/a.


Paraboloids. Only equations 7 and 8 remain. Let us compare the
first of these, $x^2/a^2 + y^2/b^2 = 2cz$, with the equation

$$\frac{x^2}{a^2} + \frac{y^2}{a^2} = 2cz,$$

which we investigate in the same way as before. It represents a surface
obtained by revolving the parabola $x^2 = 2a^2cz$ about the Oz-axis,
namely the so-called paraboloid of revolution (figure 58) discussed earlier
in connection with parabolic mirrors. The general elliptic paraboloid 7
is obtained from the paraboloid of revolution by an expansion from the
Oxz-plane.

Fig. 59.

The surface 8 has to be studied in a different way, namely by examining
its intersections with planes z = h, which are hyperbolas. The contour map
of the surface 8 is represented in figure 50; in a different position of
the coordinate axes we considered this surface in figure 51. It is
saddle-shaped, as illustrated in figure 59, and is called a hyperbolic paraboloid.
Its intersections with planes parallel to the Oxz-plane turn out to be
identical parabolas. The same result is obtained by intersections with
planes parallel to the Oyz-plane.

Rectilinear generators of a hyperboloid of one sheet. It is a very
curious and not at all obvious fact that the hyperboloid of one sheet
and the hyperbolic paraboloid can be obtained, just like the cone and
the cylinder, by the motion of a straight line. In case of the hyperboloid,
it is sufficient to prove this fact for a hyperboloid of revolution of one
sheet $x^2/a^2 + y^2/a^2 - z^2/c^2 = 1$, since the general hyperboloid of one
sheet is obtained by a uniform expansion from the Oxz-plane and under
such an expansion any straight line will go into a straight line. Let us
intersect the hyperboloid of revolution with the plane y = a parallel to
the Oxz-plane. Substituting y = a we obtain

$$\frac{x^2}{a^2} + \frac{a^2}{a^2} - \frac{z^2}{c^2} = 1 \quad\text{or}\quad \frac{x^2}{a^2} - \frac{z^2}{c^2} = 0.$$

But this equation together with y = a gives in the plane y = a a pair
of intersecting lines: $x/a - z/c = 0$ and $x/a + z/c = 0$.

Thus we have already discovered that there is a pair of intersecting
lines lying on the hyperboloid. If now we revolve the hyperboloid about
the Oz-axis, then each of these lines obviously traces out the entire
hyperboloid (figure 60).
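
That the two lines, and all their rotations about the Oz-axis, lie on the
hyperboloid of revolution can be confirmed numerically. The following is a
minimal sketch, not part of the original text; the names `on_hyperboloid`
and `generator_point`, the parametrization by t, and the sample values
a = 2, c = 3 are assumptions made for illustration.

```python
import math

def on_hyperboloid(p, a, c, eps=1e-9):
    """Check the equation x^2/a^2 + y^2/a^2 - z^2/c^2 = 1."""
    x, y, z = p
    return abs((x * x + y * y) / a**2 - z * z / c**2 - 1) < eps

def generator_point(a, c, t, alpha=0.0, sign=1):
    """Point with parameter t on the line x/a = sign * z/c, y = a,
    after the line has been turned by the angle alpha about the Oz-axis."""
    x, y, z = sign * a * t, a, c * t
    ca, sa = math.cos(alpha), math.sin(alpha)
    return x * ca - y * sa, x * sa + y * ca, z

checks = [on_hyperboloid(generator_point(2, 3, t, alpha, sign), 2, 3)
          for t in (-2, 0, 0.5, 4)
          for alpha in (0, 1, 2.5)
          for sign in (1, -1)]
print(all(checks))
```

The two values of `sign` correspond to the two families of lines on the
surface.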

It is easy to show that: (1) two arbitrary straight lines of one and the
same family of lines so obtained do not lie in the same plane (i.e., they
are skew lines), (2) any line of one of these families intersects all the
lines of the other family (except its opposite, which is parallel to it),
and (3) three lines of one and the same family are not parallel to any
one and the same plane.

With two matches and a needle it is easy to obtain a representation of
the hyperboloid of revolution of one sheet. Let us puncture one of the
matches through its middle by the needle, and on the sharp end point
of the needle we pin the other match parallel to the first match. If we
then revolve the whole apparatus about the first match as an axis, the
second match will trace the surface of a cylinder (figure 61). But if the
second match is not parallel to the first match, then during a revolution
it will trace the surface of a hyperboloid of revolution of one sheet, as
can easily be visualized if the rotation is rapid (figure 62).

Summary of the investigation of the second-degree equation. Although
the general second-degree equation with three variables can represent
essentially 17 different surfaces, it is not difficult to remember them.
The last nine are cylinders over the nine possible second-order curves,
while the first eight are divided into four pairs: two ellipsoids (real and
imaginary), two hyperboloids (of one sheet and two sheets), two
second-order cones (real and imaginary), and two paraboloids (elliptic and
hyperbolic). All these surfaces play an essential role in mechanics, physics,
and technology (ellipsoid of inertia, ellipsoid of elasticity, hyperboloid
in the Lorentz transformation in physics, paraboloid of revolution for
parabolic mirrors, etc.).


§11. Affine and Orthogonal Transformations 

The next important step in the development of analytic geometry was 
the introduction into it, and into geometry in general, of the theory of 
transformations. Here it will be necessary to explain the matter in some 
detail. 


“Contraction” of the plane toward a line. Let us consider one of the
simplest transformations of the plane, namely uniform “contraction” toward a
line with coefficient k. In the plane let there be given a line a and a positive
coefficient k, for example, k = 2/3. All points of the line a are fixed, and every
point M not lying on this line is sent into the point $M'$ such that $M'$ lies on
the same side of the line as M on the perpendicular from M to a at a distance
from a equal to 2/3 of the distance from M to a. If the coefficient k, as here,
is smaller than unity, then we have a proper contraction of the
plane to the line; but if k is greater than unity, we have an expansion
of the plane from the line, but for convenience we will in this and
other cases talk about “contraction,” except that the word “contraction”
will be put in quotation marks.

Fig. 63.

The point or figure to be transformed is called the preimage and the
one into which it is sent is its image. The point $M'$, for example, is the
image of the point M (figure 63).

We show that under a uniform “contraction” of a plane to a line,
any line of the plane is transformed into a line. For let the plane be
“contracted” to a line a lying in it with coefficient of “contraction” k.
Let b be any line of the plane, O the point in which it intersects the line a,
B another arbitrary point of b, and BA the perpendicular to the line a
from the point B (figure 64). In the “contraction” the point B goes to
the point $B'$ on this perpendicular such that $B'A = k \cdot BA$. Therefore,
the tangent of the angle $B'OA$ will be equal to $AB'/OA = k \cdot AB/OA$,
i.e., will be equal to k times the tangent of the angle which the line b
makes with line a, i.e., for all points $B'$ into which different points of
the line b are transformed, it will be one and the same. All points $B'$
consequently lie on one and the same line, passing through the point O
and making with line a an angle with this tangent.

Under “contraction” parallel lines remain parallel. Indeed, if the
tangents of the angles which lines b and c make with line a are the same,
then the tangents of those angles which the images $b'$ and $c'$ make with a
differ from them only by a factor k, i.e., they are still equal to each other,
which means that the lines $b'$ and $c'$ are also parallel to each other.

Any rectilinear segment of the plane under “contraction” to a line is
contracted (or expanded) uniformly (although to various degrees for
segments of various directions). When we speak here of “uniform”
contraction, we mean that the midpoint of the segment remains the
midpoint, the third remains the third, etc., i.e., the segment shrinks
uniformly along its full length. Indeed, in whatever ratio the point M
divides the segment $M_1M_2$, its image $M'$ will divide $M_1'M_2'$ in the same
ratio, since parallel lines (in this case perpendiculars to the line a) cut
lines intersecting them (in this case b and $b'$) in proportional parts
(figure 65).
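
These properties are easy to observe when the line a is taken as the
Ox-axis, so that the “contraction” sends (x, y) to (x, ky). The following is
a minimal sketch, not part of the original text; the helper names and the
value k = 0.5 (used instead of 2/3 only so that the arithmetic is exact in
floating point) are assumptions made for illustration.

```python
def contract(p, k):
    """Uniform "contraction" of the plane toward the Ox-axis with coefficient k."""
    x, y = p
    return x, k * y

def slope(p, q):
    """Tangent of the angle that the line pq makes with the Ox-axis."""
    return (q[1] - p[1]) / (q[0] - p[0])

def midpoint(p, q):
    return (p[0] + q[0]) / 2, (p[1] + q[1]) / 2

k = 0.5
P, Q = (0.0, 1.0), (4.0, 3.0)
P1, Q1 = contract(P, k), contract(Q, k)

print(slope(P1, Q1), k * slope(P, Q))              # the tangent is multiplied by k
print(contract(midpoint(P, Q), k), midpoint(P1, Q1))  # the midpoint stays the midpoint
print(contract((3.0, 0.0), k))                     # points of the line a are fixed
```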

The ellipse as the result of “contraction” of a circle. We consider a
circle with center at the origin and radius a. By the theorem of Pythagoras
its equation is $x^2 + \bar{y}^2 = a^2$, where we have written $\bar{y}$ instead of y, since y
will be needed later. Let us see what this circle is contracted into if we
“contract” the plane to the Ox-axis with coefficient b/a (figure 66).
After this “contraction” the x-values of all points remain the same, but
the $\bar{y}$-values become equal to $y = \bar{y}\,(b/a)$, i.e., $\bar{y} = (a/b)\,y$.
Substituting $\bar{y}$ in the above equation of the circle, we will have:

$$x^2 + \frac{a^2}{b^2}\,y^2 = a^2, \quad\text{or}\quad \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

as the equation, in the same coordinate system, of the curve obtained
from the given circle by contraction to the Ox-axis. As we see, we obtain
an ellipse. Thus we have proved that an ellipse is the result of a
“contraction” of a circle.

From the fact that an ellipse is a “contraction” of a circle, many
properties of the ellipse follow directly. For example, the aforementioned
property of diameters, namely that if parallel secants of an ellipse are
given, then their midpoints lie on a straight line (see figure 12), can be
shown in the following way.

Fig. 66.

We perform the inverse expansion of the ellipse into the circle. Under
this expansion parallel chords of the ellipse go into parallel chords of the
circle, and their midpoints into the midpoints of these chords. But the
midpoints of parallel chords of a circle lie on a diameter, i.e., on a
straight line, and so the midpoints of parallel chords of the ellipse also
lie on a straight line. Namely, they lie on that line which is obtained
from the diameter of the circle under the “contraction” which sends the
circle into the ellipse.

Here is another application of the theory of “contraction.” Since any
vertical strip of the circle under its contraction to the Ox-axis does not
change its width and its length is multiplied by b/a, the area of this strip
after contraction is equal to its initial area multiplied by b/a, and since
the area of the circle is equal to $\pi a^2$, the area of the corresponding
ellipse is equal to $\pi a^2 (b/a) = \pi ab$.
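
The two facts just used (the circle goes into the ellipse, and every area is multiplied by b/a) can be checked numerically. The following is only a minimal sketch: the semiaxes a = 3, b = 2, the number of sample points, and the helper names `contract_to_x_axis` and `polygon_area` are illustrative assumptions, and the circle is approximated by an inscribed polygon.

```python
import math

def contract_to_x_axis(point, k):
    """Contraction of the plane to the Ox-axis with coefficient k: (x, y) -> (x, k*y)."""
    x, y = point
    return (x, k * y)

def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Illustrative values (assumptions, not taken from the text).
a, b = 3.0, 2.0
n = 2000

# Points of the circle x^2 + y^2 = a^2 and their images under the contraction.
circle = [(a * math.cos(2 * math.pi * i / n), a * math.sin(2 * math.pi * i / n))
          for i in range(n)]
ellipse = [contract_to_x_axis(p, b / a) for p in circle]

# Every image point satisfies x^2/a^2 + y^2/b^2 = 1 up to rounding.
on_ellipse = all(abs(x * x / a**2 + y * y / b**2 - 1.0) < 1e-12 for x, y in ellipse)

# Areas are multiplied by b/a, so the ellipse has area close to pi*a*b.
area_ratio = polygon_area(ellipse) / polygon_area(circle)
ellipse_area = polygon_area(ellipse)
```

Because the polygon only approximates the circle, `ellipse_area` agrees with $\pi ab$ only approximately, while `area_ratio` equals b/a up to rounding.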

Example of the solution of a more complicated problem. Let an ellipse
be given and let it be required to find the triangle with smallest area
circumscribed to this ellipse. We first solve the problem for a circle. We
show that in the case of a circle, this is an equilateral triangle. Indeed,
let the circumscribed triangle be nonequilateral; i.e., the smallest of its
angles (denoted by B) is less than 60°, and the largest C > 60°. If then,
without varying the angle A, we move side BC into the position $B_0C_0$
(figure 67) by shifting the vertex B toward A until one of the angles $B_0$
or $C_0$ becomes equal to 60°, we obtain a circumscribed triangle $AB_0C_0$
with smaller area, since here* OC < OB, $OC_0 \le OB_0$ and therefore the
discarded area $OBB_0$ is greater than the added one $OCC_0$. If the triangle
so obtained is not equilateral, then by repeating the above procedure
we reduce its area still further and arrive at an equilateral triangle. Hence,
any nonequilateral triangle circumscribed to a given circle has a greater
area than an equilateral one.

We now return to the ellipse. Let us make an expansion of it from
the major axis, thereby converting it back into the circle from which it
was obtained by “contraction.” Under this expansion (figure 68): (1) all
triangles circumscribed to the ellipse are transformed into triangles
circumscribing the resultant circle; (2) the areas of all figures, and in
particular of these triangles, are increased in one and the same ratio.
From this we see that the triangles circumscribing the given ellipse with
the smallest area will be those that are converted into equilateral triangles
circumscribing the circle. There are infinitely many such triangles; each
of them has its center of gravity at the center of the ellipse and the points
of tangency are in the middle of its sides. Any of these triangles can
easily be constructed (figure 68), starting from the aforementioned
circle.

* As can be easily shown.

“Contractions” of the plane to a line are only a particular case of 
more general, so-called affine transformations of the plane. 

General affine transformations. A pair of vectors $e_1$, $e_2$ starting from
a common origin O and not lying on the same line will be called a
coordinate “frame” of the plane. The coordinates of a point M of the
plane relative to this frame $Oe_1e_2$ will then be numbers x, y such that
in order to reach the point M from the origin O it is necessary to lay
off from the point O x-times the vector $e_1$ and then y-times the vector $e_2$.
This is a general Cartesian coordinate system of the plane. Analogously,
a general Cartesian coordinate system can be introduced in space. The
ordinary, so-called rectangular Cartesian coordinate system that we have
made use of up to now corresponds to the particular case when the
coordinate vectors $e_1$, $e_2$ are mutually perpendicular and their lengths
are equal to the unit of measurement.

A general affine transformation of the plane is one under which a
given net of equal parallelograms is transformed into another arbitrary
net of equal parallelograms. More precisely, it is a transformation of
the plane under which a given coordinate frame $Oe_1e_2$ is transformed
into a certain other frame (generally speaking, with another “metric,” i.e.,
with different lengths for the vectors $e'_1$ and $e'_2$ and a different angle
between them) and an arbitrary point M is sent into the point M' having
the same coordinates relative to the new frame as M had relative to the
old (figure 69).

“Contraction” to the Ox-axis with coefficient k is a special case in
which the rectangular frame $Oe_1e_2$ passes into the frame $Oe_1(ke_2)$.

It can easily be shown that under an affine transformation every
straight line is sent into a straight line, parallel lines are mapped into
parallel lines, and if a point divides a segment in a given ratio, then its
image divides the image of this segment in the same ratio. Moreover, we
can prove the remarkable theorem that any affine transformation of the
plane can be obtained by performing a certain rigid motion of the plane
onto itself, and then, in general, two “contractions” with different
coefficients $k_1$ and $k_2$ to two mutually perpendicular lines.

For the proof of this assertion, we consider all radii of some circle of
the plane (figure 70). Let radius OA be the one which, after the
transformation, turns out to be the shortest, and let it be mapped into
O'A'. The perpendicular AB to OA is then transformed into A'B', which
must be perpendicular to O'A', since if the perpendicular O'C' were
different from O'A', then it would be the image of the oblique OC, and
the image O'D' of the radius OD would be a part of the perpendicular
O'C', i.e., shorter than the oblique O'A', contrary to assumption.

The mutually perpendicular lines OA and AB are therefore mapped into
mutually perpendicular lines O'A' and A'B'. Consequently, the square net
constructed on OA and AB is transformed into a net of equal rectangles
(figure 71) and uniform “contractions” take place along the straight
lines of this square net.
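
The construction used in this proof can be tried on a concrete transformation. The sketch below is an illustration under stated assumptions, not part of the proof: the coefficients 2, 1, 0.5, 3 of the linear part are arbitrary, and the helper name `principal_directions` is hypothetical. It finds the radius whose image is shortest, and then checks that the perpendicular radius has an image perpendicular to it.

```python
import math

def principal_directions(a1, b1, a2, b2):
    """For the linear part x' = a1*x + b1*y, y' = a2*x + b2*y, return the unit
    radius u whose image is shortest, the perpendicular unit radius v,
    and the images of u and v."""
    p = a1 * a1 + a2 * a2   # squared length of the image of (1, 0)
    q = a1 * b1 + a2 * b2   # scalar product of the images of (1, 0) and (0, 1)
    r = b1 * b1 + b2 * b2   # squared length of the image of (0, 1)
    # The squared length of the image of (cos t, sin t) equals
    #   (p + r)/2 + (p - r)/2 * cos(2t) + q * sin(2t),
    # which is largest at 2t = atan2(2q, p - r) and smallest a quarter turn later.
    t = 0.5 * math.atan2(2 * q, p - r) + math.pi / 2
    u = (math.cos(t), math.sin(t))
    v = (-math.sin(t), math.cos(t))

    def image(w):
        return (a1 * w[0] + b1 * w[1], a2 * w[0] + b2 * w[1])

    return u, v, image(u), image(v)

# Arbitrary illustrative coefficients (an assumption, not from the text).
a1, b1, a2, b2 = 2.0, 1.0, 0.5, 3.0
u, v, iu, iv = principal_directions(a1, b1, a2, b2)

dot_images = iu[0] * iv[0] + iu[1] * iv[1]   # close to 0: the images are perpendicular
k1 = math.hypot(*iu)                          # coefficient along the shortest direction
k2 = math.hypot(*iv)                          # coefficient along the perpendicular one
```

Since the images of the two perpendicular unit radii are themselves perpendicular, the product `k1 * k2` is the factor by which areas change.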

In a completely analogous way a general affine transformation of space
can be defined as one under which a space coordinate frame $Oe_1e_2e_3$ is
transformed into some other frame $O'e'_1e'_2e'_3$, generally speaking, with a
different “metric,” i.e., with unit segments of different lengths and with
different angles between them, and a point M is sent into the point M'
having the same coordinates relative to the new frame as those of the
point M relative to the old frame.

Fig. 69.


All the properties enumerated here also hold for affine transformations
of space, except that in the last theorem there will be a rigid motion of
space and then three “contractions” to three mutually perpendicular
planes with certain coefficients $k_1$, $k_2$, $k_3$.






Applications of affine transformations. The most important applications 
of affine transformations are: 

1. In the first place there is the application in geometry to solving 
problems concerning affine properties of figures, i.e., properties that are 
preserved under affine transformations. The theorem about the diameters 
of an ellipse and the problem of circumscribed triangles were examples. 
To solve such problems we make an affine transformation of the figure 
to some simpler one, for which we prove the desired property and then 
return to the original figure. 

2. Second, there is the application in analytic geometry to the
classification of second-order curves and surfaces. The main point is, as
can be shown, that different ellipses are related to one another in the
sense that one can be obtained from another by an affine transformation
(the Latin word affinis means “related”). Also all hyperbolas are affine to
one another, and so are all parabolas. But we cannot convert an ellipse into
a parabola, or a hyperbola into a parabola, by an affine transformation,
i.e., they are not affinely related to one another. It is natural to divide
up all second-order curves into affine classes of curves, affinely related
to one another. It turns out that the reduction of an equation to canonical
form gives exactly this classification; i.e., there are nine affine classes of
second-order curves. (We will not go into detail why imaginary ellipses
and pairs of imaginary parallel lines belong to different affine classes.
Properly speaking, neither in one case nor in the other are there any
curves on the plane at all. The question here is really about algebraic
properties of the equation itself.)

Similarly, the classification of second-order surfaces according to their
canonical equations into 17 forms is the same as the affine classification.

Let us give a simple example of the application of the affine classification
of second-order surfaces. We show that if we arbitrarily select in space three
lines a, b, c such that (1) any two of them do not lie in the same plane
(i.e., they are skew to each other) and (2) they are not all parallel to one
and the same plane, then the set of all straight lines d of space, each of
which simultaneously intersects all three given lines a, b, c (figure 72)
constitutes the entire surface of a hyperboloid of one sheet.




Let us explain more fully the set of lines d we are discussing here.
Through an arbitrary point A of line a, we can pass a plane P containing
the line b and a plane Q containing the line c. These planes P and Q
intersect in a unique line d, which passes through the point A of line a
and intersects lines b and c. Drawing all such lines d through arbitrary
points of line a, we obtain the set of all those lines d of space each of
which intersects all three given lines a, b, and c. This collection of lines
determines a surface. We note that any given hyperboloid of one sheet
can be obtained in this way, since we only need to take for the lines
a, b, and c three distinct straight lines $a_0$, $b_0$, $c_0$ of one family (figure 73)
and for the lines d all the straight lines of the other family. Conversely,
let there be given three arbitrary pairwise skew lines of space a, b, c,
not all parallel to one and the same plane. Then, as can be shown, these
lines always form the three edges (without common points) of some
parallelepiped (figure 74). After constructing such parallelepipeds for the
given lines a, b, c and for three lines $a_0$, $b_0$, $c_0$ of one and the same
family of an arbitrary hyperboloid of one sheet, we make an affine
transformation of space that sends the parallelepiped $a_0$, $b_0$, $c_0$ into the
parallelepiped a, b, c; obviously, this transformation maps the hyperboloid
onto the surface in question. But according to the affine classification
of second-order surfaces, the affine image of a hyperboloid of one sheet
is again a hyperboloid of one sheet.
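
As a small check of this statement on one concrete configuration, the sketch below takes three pairwise skew edges of the unit cube and verifies, in exact rational arithmetic, that every line d meeting all three lies on a single second-order surface. Everything specific here is an assumption made for illustration: the choice of lines a: (t, 0, 0), b: (1, s, 1), c: (0, 1, u), the quadric yz + zx - xy - z = 0 (found by hand for these particular lines), and the helper names.

```python
from fractions import Fraction

def quadric(x, y, z):
    """Left side of y*z + z*x - x*y - z = 0, a quadric containing the lines
    a: (t, 0, 0), b: (1, s, 1), c: (0, 1, u) chosen for this illustration."""
    return y * z + z * x - x * y - z

def transversal(t):
    """Points A on a, B on b, C on c of the line d through A = (t, 0, 0)
    that meets b and c (requires t different from 0 and 1)."""
    t = Fraction(t)
    s = (t - 1) / t      # d meets b at B = (1, s, 1)
    u = t / (t - 1)      # d meets c at C = (0, 1, u)
    A = (t, Fraction(0), Fraction(0))
    B = (Fraction(1), s, Fraction(1))
    C = (Fraction(0), Fraction(1), u)
    return A, B, C

def collinear(A, B, C):
    """True if A, B, C lie on one line (cross product of AB and AC vanishes)."""
    ab = [B[i] - A[i] for i in range(3)]
    ac = [C[i] - A[i] for i in range(3)]
    cross = (ab[1] * ac[2] - ab[2] * ac[1],
             ab[2] * ac[0] - ab[0] * ac[2],
             ab[0] * ac[1] - ab[1] * ac[0])
    return cross == (0, 0, 0)

def point_on_transversal(t, lam):
    """Point A + lam * (B - A) of the line d through A = (t, 0, 0)."""
    A, B, _ = transversal(t)
    return tuple(A[i] + lam * (B[i] - A[i]) for i in range(3))

sample_t = (2, 3, -1, Fraction(1, 2))
values = [quadric(*point_on_transversal(t, Fraction(lam, 3)))
          for t in sample_t for lam in range(-6, 7)]
all_on_quadric = all(v == 0 for v in values)
```

The check covers only sampled points of sampled lines d; it illustrates the statement for this one configuration and does not replace the argument by affine classification.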

3. Third, there is the application to the theory of continuous
transformations of continuous media, for example, in the theory of
elasticity, in the theory of electric or magnetic fields, etc. Very small
elements of the given continuous medium transform “almost” affinely. So
to speak, “in the small the transformation is linear” (we call a
first-degree expression linear, and in the following section we see that in
analytic geometry the formulas of affine transformations are of the first
degree). This is evident in figure 75. On the lines of the large square net,
their distortion or “fanning out” is clearly noticeable. But for a small
piece of the very dense square net, all this shows itself very little, and
the square net transforms “almost” into a net of equal parallelograms. A
similar picture is obtained in space also (figure 76). By the fact that any
affine transformation of space reduces to a motion and three mutually
perpendicular “contractions,” it follows that an element of a body under
an elastic deformation first moves as a rigid body and then undergoes three
mutually perpendicular “contractions.”

Formulas of affine transformations. If the frame $Oe_1e_2$ is affinely
transformed and $O'e'_1e'_2$ is its image, while the coordinates of the new
origin O' relative to the old frame are $\xi$, $\eta$ and the coordinates of the
vectors $e'_1$ and $e'_2$ relative to the old frame are $a_1$, $a_2$ and $b_1$, $b_2$,
respectively, then the formulas of the affine transformation, as can be
easily seen from figure 77, are

$$\begin{aligned} x' &= a_1x + b_1y + \xi, \\ y' &= a_2x + b_2y + \eta \end{aligned}$$

in the sense that if x, y are the coordinates of any point M relative to
the old frame $Oe_1e_2$, then x', y', given by these formulas, are the
coordinates relative to the same frame of the image M' of this point.

Indeed, let $Oe_1e_2$ be a frame before transformation, and $O'e'_1e'_2$ its
image, while M is an arbitrary point of the plane and M' is its image.
Then by the very definition of an affine transformation, if the coordinates
of the point M relative to the frame $Oe_1e_2$ are x, y, then the coordinates
of its image M' relative to the image $O'e'_1e'_2$ of this frame are exactly
the same x, y.

Now consider a vector m' joining the origin O of the old frame to the
image M' of the point M. Then $m' = x'e_1 + y'e_2$. But this vector is equal
to a certain vector sum

$$m' = \xi e_1 + \eta e_2 + xe'_1 + ye'_2,$$

and the vectors $e'_1$ and $e'_2$ are

$$e'_1 = a_1e_1 + a_2e_2, \qquad e'_2 = b_1e_1 + b_2e_2,$$

so that

$$m' = \xi e_1 + \eta e_2 + a_1xe_1 + a_2xe_2 + b_1ye_1 + b_2ye_2$$

or

$$m' = (a_1x + b_1y + \xi)e_1 + (a_2x + b_2y + \eta)e_2.$$

Comparing this expression with the first expression for m' we obtain

$$\begin{aligned} x' &= a_1x + b_1y + \xi, \\ y' &= a_2x + b_2y + \eta. \end{aligned} \qquad (21)$$

The determinant

$$\Delta = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1b_2 - a_2b_1,$$

as can be shown, is not zero and is equal to the ratio of area of the
parallelogram constructed on the vectors of the new frame to the area
of the same parallelogram constructed on the vectors of the old frame.
Analogous formulas are obtained for space:

$$\begin{aligned} x' &= a_1x + b_1y + c_1z + \xi, \\ y' &= a_2x + b_2y + c_2z + \eta, \\ z' &= a_3x + b_3y + c_3z + \zeta, \end{aligned} \qquad (22)$$

where $(\xi, \eta, \zeta)$ are the coordinates of the origin O' of the transformed
frame $O'e'_1e'_2e'_3$ and $(a_1, a_2, a_3)$, $(b_1, b_2, b_3)$, $(c_1, c_2, c_3)$ are the
coordinates of its vectors $e'_1$, $e'_2$, $e'_3$ relative to the old frame $Oe_1e_2e_3$.

The determinant*

$$\Delta = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1b_2c_3 + a_2b_3c_1 + a_3b_1c_2 - a_1b_3c_2 - a_2b_1c_3 - a_3b_2c_1$$

is not zero and is equal to the ratio of the volume of the parallelepiped
formed by the vectors of the new frame to the volume of the parallelepiped
formed by the vectors of the old frame.
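
For the plane, the first-degree formulas and the interpretation of the determinant as a ratio of areas can be tried on numbers. This is a sketch only: the coefficient values and the helper names `affine_map` and `signed_area` are assumptions chosen for illustration.

```python
def affine_map(a1, b1, a2, b2, xi, eta):
    """Return the map (x, y) -> (a1*x + b1*y + xi, a2*x + b2*y + eta)."""
    def apply(point):
        x, y = point
        return (a1 * x + b1 * y + xi, a2 * x + b2 * y + eta)
    return apply

def signed_area(vertices):
    """Signed area of a polygon (positive for counterclockwise order)."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2

# Illustrative coefficients (assumptions): new frame vectors (2, -1) and (1, 3),
# new origin (5, -4), all relative to the old rectangular frame.
a1, b1, a2, b2, xi, eta = 2, 1, -1, 3, 5, -4
f = affine_map(a1, b1, a2, b2, xi, eta)
delta = a1 * b2 - a2 * b1

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(1, 1), (4, 2), (2, 5)]

square_ratio = signed_area([f(p) for p in unit_square]) / signed_area(unit_square)
triangle_ratio = signed_area([f(p) for p in triangle]) / signed_area(triangle)
```

Both ratios come out equal to `delta`, and the images of (0, 0), (1, 0), (0, 1) are the new origin and the endpoints of the new frame vectors laid off from it.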

Orthogonal transformations. Rigid motions of the plane onto itself 
or such motions plus a reflection about a line lying in the plane, are 
called orthogonal transformations of the plane, and rigid motions of space,
or such motions plus a reflection of the space about one of its planes, 
are called orthogonal transformations of space. It is clear that orthogonal 
transformations are affine transformations under which the “metric” of 
the frame does not change, since it only undergoes a rigid motion, or 
else such a motion plus a reflection. 

We will investigate orthogonal transformations by means of rectangular
coordinates, i.e., when the vectors of the original frame are mutually
perpendicular and have lengths equal to the unit of measurement. After
an orthogonal transformation the vectors of the frame remain mutually
perpendicular, i.e., their scalar product remains equal to zero and their
lengths remain equal to unity. Therefore (see formula (14), this chapter)
in the case of the plane, we have

$$a_1b_1 + a_2b_2 = 0, \quad a_1^2 + a_2^2 = 1, \quad b_1^2 + b_2^2 = 1, \qquad (21')$$

and in the case of space

$$\begin{aligned} a_1b_1 + a_2b_2 + a_3b_3 &= 0, & a_1^2 + a_2^2 + a_3^2 &= 1, \\ a_1c_1 + a_2c_2 + a_3c_3 &= 0, & b_1^2 + b_2^2 + b_3^2 &= 1, \\ b_1c_1 + b_2c_2 + b_3c_3 &= 0, & c_1^2 + c_2^2 + c_3^2 &= 1. \end{aligned} \qquad (22')$$

* On determinants see Chapter XVI.

Hence, if the initial frame is taken to be rectangular, then formulas (21)
give an orthogonal transformation if and only if the conditions (21') of
orthogonality are fulfilled, and formulas (22) give an orthogonal
transformation of space if the conditions (22') of orthogonality are
satisfied. It can be shown that if $\Delta > 0$, we have a rigid motion, and if
$\Delta < 0$ a rigid motion plus a reflection.
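
The orthogonality conditions for the plane and the sign test on the determinant are easy to try out. The sketch below is illustrative only: the angle of 30°, the tolerance, and the helper names `is_orthogonal` and `kind` are assumptions, and coefficients are passed in the order $a_1$, $b_1$, $a_2$, $b_2$.

```python
import math

def is_orthogonal(a1, b1, a2, b2, tol=1e-12):
    """Check the plane orthogonality conditions: the new frame vectors
    (a1, a2) and (b1, b2) are perpendicular and of unit length."""
    return (abs(a1 * b1 + a2 * b2) < tol
            and abs(a1 * a1 + a2 * a2 - 1) < tol
            and abs(b1 * b1 + b2 * b2 - 1) < tol)

def kind(a1, b1, a2, b2):
    """Classify the linear part by the orthogonality conditions and the sign
    of the determinant a1*b2 - a2*b1."""
    if not is_orthogonal(a1, b1, a2, b2):
        return "not orthogonal"
    delta = a1 * b2 - a2 * b1
    return "rigid motion" if delta > 0 else "rigid motion plus reflection"

phi = math.radians(30)  # illustrative angle (an assumption)
c, s = math.cos(phi), math.sin(phi)

rotation = (c, -s, s, c)           # frame vectors (c, s) and (-s, c)
reflection = (c, s, s, -c)         # frame vectors (c, s) and (s, -c)
contraction = (1.0, 0.0, 0.0, 0.5) # contraction to the Ox-axis with k = 1/2

results = [kind(*rotation), kind(*reflection), kind(*contraction)]
```

The contraction fails the unit-length condition, so it is affine but not orthogonal.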


§12. Theory of Invariants 

The concept of invariant.* Invariants of a second-degree equation with
two variables. In the second half of the last century still another
important new concept was introduced, that of invariant.

Consider, for example, a second-degree polynomial in two variables

$$Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F. \qquad (23)$$

If we regard x, y as rectangular coordinates and make a transformation
to new rectangular axes, then after replacing x, y in (23) by their
expressions in terms of the new coordinates x', y', removing parentheses,
and reducing similar terms, we obtain a new transformed polynomial
with different coefficients

$$A'x'^2 + 2B'x'y' + C'y'^2 + 2D'x' + 2E'y' + F'. \qquad (24)$$

It turns out that there exist expressions formed from the coefficients
which under this transformation do not change their numerical value,
although the coefficients themselves change. Such an expression in
A', B', C', D', E', F' has exactly the same numerical value as when it is
formed with the A, B, C, D, E, F.

Expressions of this kind are called invariants of the polynomial (23)
with respect to the group of orthogonal transformations (i.e., relative to
transformations from one set of rectangular coordinates x, y to any
other rectangular coordinates x', y').

Invariants of this sort, as it turns out, are

$$I_1 = A + C,$$

$$I_2 = \begin{vmatrix} A & B \\ B & C \end{vmatrix} = AC - B^2,$$

$$I_3 = \begin{vmatrix} A & B & D \\ B & C & E \\ D & E & F \end{vmatrix} = ACF + 2BDE - AE^2 - CD^2 - FB^2,$$

i.e.,

$$A + C = A' + C', \qquad AC - B^2 = A'C' - B'^2,$$

$$ACF + 2BDE - AE^2 - CD^2 - FB^2 = A'C'F' + 2B'D'E' - A'E'^2 - C'D'^2 - F'B'^2.$$

* Invarians in Latin means “unchanged.”

It is possible to prove the important theorem that any orthogonal 
invariant of the polynomial (23) can be expressed in terms of these three 
basic invariants. 
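
The invariance of these three expressions can be observed numerically. The sketch below is one possible way to do it, under stated assumptions: the polynomial is stored as the symmetric matrix of its coefficients, the change to new rectangular axes is taken as a rotation by an angle `phi` followed by a shift of the origin to (`xi`, `eta`), and the sample coefficients, angle, shift, and helper names are all chosen for illustration.

```python
import math

def invariants(A, B, C, D, E, F):
    """The three expressions I1, I2, I3 formed from the coefficients."""
    I1 = A + C
    I2 = A * C - B * B
    I3 = A * C * F + 2 * B * D * E - A * E * E - C * D * D - F * B * B
    return I1, I2, I3

def change_axes(coeffs, phi, xi, eta):
    """Coefficients of the polynomial after the substitution
    x = c*x' - s*y' + xi, y = s*x' + c*y' + eta, with c = cos(phi), s = sin(phi)."""
    A, B, C, D, E, F = coeffs
    # The polynomial equals v^T M v for v = (x, y, 1).
    M = [[A, B, D], [B, C, E], [D, E, F]]
    c, s = math.cos(phi), math.sin(phi)
    # v = T v' for v' = (x', y', 1), so the new matrix is T^T M T.
    T = [[c, -s, xi], [s, c, eta], [0.0, 0.0, 1.0]]
    MT = [[sum(M[i][k] * T[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    Mp = [[sum(T[k][i] * MT[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return (Mp[0][0], Mp[0][1], Mp[1][1], Mp[0][2], Mp[1][2], Mp[2][2])

# Illustrative data (assumptions): sample coefficients, angle, and shift.
old = (1.0, -3.0, 5.0, -1.0, 2.0, 3.0)
new = change_axes(old, phi=0.7, xi=2.0, eta=-1.5)

old_invariants = invariants(*old)
new_invariants = invariants(*new)
```

The individual coefficients in `new` differ from those in `old`, while the two triples of invariants agree up to rounding.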

If we equate the polynomial (23) to zero, we obtain an equation of
some second-order curve. Any quantity connected with this curve but
not with its location in the plane will clearly not depend on what
coordinates its equation is written in, and therefore, when expressed in
terms of the coefficients, it will be an orthogonal invariant of the
polynomial (23), and thus it will be expressible in terms of the three
basic invariants. Moreover, since under multiplication of all six
coefficients of the equation by any given number t (different from zero)
the curve represented by the equation remains the same, an expression of
any property of the curve in terms of the $I_1$, $I_2$, $I_3$ must certainly be
such that if the A, B, C, D, E, F in it are multiplied by t, the number t
cancels out. The expression in question must be, as they say, homogeneous
of degree zero relative to A, B, C, D, E, F.

Let us verify this by an example. For instance, let the equation

$$Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F = 0$$

represent an ellipse. Since the equation completely determines this ellipse,
we can calculate from it (i.e., from its coefficients) all the basic quantities
connected with the ellipse. For example, we can calculate its semiaxes a
and b, i.e., we can express the semiaxes in terms of the coefficients. The
expressions for these semiaxes will be invariants and therefore expressible
in terms of $I_1$, $I_2$, $I_3$. By reduction of the equation to canonical form
and some subsequent calculation, the following rather complicated
expressions for the semiaxes are obtained in terms of $I_1$, $I_2$, $I_3$:

$$\sqrt{\frac{2\,|I_3|}{|I_2|\,\bigl|I_1 \pm \sqrt{I_1^2 - 4I_2}\bigr|}},$$

which are homogeneous of degree zero relative to A, B, C, D, E, F.

From this it is clear that the invariants $I_1$, $I_2$, $I_3$ themselves, being
homogeneous but not of degree zero, do not have straightforward
geometric meanings; they are algebraic entities.

It can be shown that the expression

$$K_1 = \begin{vmatrix} A & D \\ D & F \end{vmatrix} + \begin{vmatrix} C & E \\ E & F \end{vmatrix} = AF - D^2 + CF - E^2$$

can be varied by parallel translation but not by pure rotation of the
given rectangular axes, and it is therefore called a semi-invariant.

As an example of an application of invariants and semi-invariants,
we give Table 1, which, if we calculate $I_1$, $I_2$, $I_3$ and $K_1$, allows us to
determine directly from its equation the affine class of a second-order
curve.

In Table 1 the necessary and sufficient conditions are given that an
equation of a second-order curve be reducible to one or another of the
nine canonical forms ($I_1I_3$ designates the product of $I_1$ and $I_3$).

Table 1

Criterion of the class            Name                              Reduced    Canonical
                                                                    equation   equation

$I_2 > 0$, $I_1I_3 < 0$           ellipse                           (I)        $x^2/a^2 + y^2/b^2 = 1$
$I_2 > 0$, $I_1I_3 > 0$           imaginary ellipse                 (I)        $x^2/a^2 + y^2/b^2 = -1$
$I_2 > 0$, $I_3 = 0$              point                             (I)        $x^2/a^2 + y^2/b^2 = 0$
$I_2 < 0$, $I_3 \ne 0$            hyperbola                         (I)        $x^2/a^2 - y^2/b^2 = 1$
$I_2 < 0$, $I_3 = 0$              pair of intersecting lines        (I)        $x^2/a^2 - y^2/b^2 = 0$
$I_2 = 0$, $I_3 \ne 0$            parabola                          (II)       $x^2 = 2py$
$I_2 = 0$, $I_3 = 0$, $K_1 < 0$   pair of parallel lines            (III)      $x^2 = a^2$
$I_2 = 0$, $I_3 = 0$, $K_1 > 0$   pair of imaginary parallel lines  (III)      $x^2 = -a^2$
$I_2 = 0$, $I_3 = 0$, $K_1 = 0$   pair of coincident lines          (III)      $x^2 = 0$

Consider, for example, the equation $x^2 - 6xy + 5y^2 - 2x + 4y + 3 = 0$.
We have A = 1, B = -3, C = 5, D = -1, E = 2, F = 3, so that $I_1 = 6$,
$I_2 = -4$, $I_3 = -9$. The conditions of the 4th line of the table are
satisfied: $I_2 < 0$, $I_3 \ne 0$, i.e., this is a hyperbola. Its semiaxes are
equal to

$$\sqrt{\frac{2 \cdot 9}{4 \cdot \bigl|6 \pm \sqrt{36 + 16}\bigr|}} \approx 0.58 \text{ and } 1.93.$$
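
The use of the table can also be written out as a short procedure. The following is a sketch under stated assumptions: it compares the invariants with zero exactly, so it is meant for integer or otherwise exact coefficients, and the function names `curve_invariants`, `affine_class`, and `semiaxes` are hypothetical. The semiaxes formula applies to the ellipse and hyperbola rows only.

```python
import math

def curve_invariants(A, B, C, D, E, F):
    """Invariants I1, I2, I3 and the semi-invariant K1 of
    A x^2 + 2B xy + C y^2 + 2D x + 2E y + F = 0."""
    I1 = A + C
    I2 = A * C - B * B
    I3 = A * C * F + 2 * B * D * E - A * E * E - C * D * D - F * B * B
    K1 = A * F - D * D + C * F - E * E
    return I1, I2, I3, K1

def affine_class(A, B, C, D, E, F):
    """Affine class of the curve, following the criteria of Table 1.
    Exact comparisons with zero: intended for exact (e.g. integer) coefficients."""
    I1, I2, I3, K1 = curve_invariants(A, B, C, D, E, F)
    if I2 > 0:
        if I3 == 0:
            return "point"
        return "ellipse" if I1 * I3 < 0 else "imaginary ellipse"
    if I2 < 0:
        return "hyperbola" if I3 != 0 else "pair of intersecting lines"
    if I3 != 0:
        return "parabola"
    if K1 < 0:
        return "pair of parallel lines"
    if K1 > 0:
        return "pair of imaginary parallel lines"
    return "pair of coincident lines"

def semiaxes(A, B, C, D, E, F):
    """Semiaxes of an ellipse or hyperbola, smaller one first."""
    I1, I2, I3, _ = curve_invariants(A, B, C, D, E, F)
    root = math.sqrt(I1 * I1 - 4 * I2)
    return sorted(math.sqrt(2 * abs(I3) / (abs(I2) * abs(I1 + sign * root)))
                  for sign in (1, -1))

example = (1, -3, 5, -1, 2, 3)      # x^2 - 6xy + 5y^2 - 2x + 4y + 3 = 0
example_class = affine_class(*example)
example_semiaxes = semiaxes(*example)
```

For floating-point coefficients the exact comparisons with zero would have to be replaced by comparisons with a tolerance, which is a design choice left open here.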


The coefficients of the reduced equations (I), (II), and (III) are given
in terms of invariants and semi-invariants as follows:

$$\lambda_1x'^2 + \lambda_2y'^2 + \frac{I_3}{I_2} = 0, \qquad \text{(I)}$$

$$I_1x'^2 + 2\sqrt{-\frac{I_3}{I_1}}\;y' = 0, \qquad \text{(II)}$$

$$I_1x'^2 + \frac{K_1}{I_1} = 0, \qquad \text{(III)}$$

where $\lambda_1$ and $\lambda_2$ are the roots of the so-called characteristic quadratic
equation

$$\lambda^2 - I_1\lambda + I_2 = 0.$$

Formulas (I)-(III) allow a quick calculation of the semiaxes a and b
of an ellipse and a hyperbola, the parameter p of a parabola, and the
distance 2a between parallel lines. The formulas for semiaxes were given
earlier. The parameter p is equal to

$$p = \sqrt{-\frac{I_3}{I_1^3}},$$

and the distance

$$2a = 2\sqrt{-\frac{K_1}{I_1^2}}.$$

A completely analogous theory of invariants and semi-invariants, with
a corresponding table for the determination of the affine class and the
formulas of the coefficients of reduced equations, can be given for
second-order surfaces in three-dimensional space.

It should be pointed out that so far we have been discussing only
those invariants that are considered in analytic geometry for curves and
surfaces of the second order. The concept of invariant, however, has a
far broader meaning.





By an invariant of some object under study, relative to certain of its
transformations, we mean any quantity (numerical, vectorial, etc.)
connected with this object that does not vary under these transformations.
In the previous problem the object is a second-degree polynomial with
two variables (i.e., more precisely, the set of its coefficients), and the
transformations are those of the polynomial obtained by the transition
from one rectangular coordinate system to another.

Another example: The object is a given mass of a given gas under 
a given temperature. The transformations are changes in volume or
pressure of this mass of gas. The invariant, according to the Boyle- 
Mariotte law, is the product of the volume by the pressure. We can 
speak of lengths of segments in space or the size of angles as invariants 
of the group of motions of space, of ratios in which a point divides a 
segment, or of ratios of areas, as the invariants of the group of affine 
transformations of space, etc. 

Various invariants are particularly important in physics. 

§13. Projective Geometry 

Perspective projections. Artists began long ago to study the laws of
perspectivity. This was necessary because a human being sees objects in
perspective projection on the retina of the eye, in such a way that the
form and mutual location of objects are distorted in a characteristic
manner. For example, telegraph poles in the distance look smaller and
closer together, parallel tracks of a railway seem to converge, etc. We
will not consider here space perspectivities, i.e., properties of perspective
projections of objects in space onto a plane, but only the properties of
perspective projections of a plane onto a plane.

Let us consider a photograph (for example, one frame of a moving 
picture film) P, a screen P', and between them a lens S (figure 78). Then, 
if the photograph is transparent and is illuminated from behind (if it is 
nontransparent, let it be illuminated from the front, i.e., from the side 
where the lens is), then the illuminated points of the photograph radiate 
beams of light, which are collected by the lens in such a way that they 
appear again on the screen P' in the form of points. We will assume that 
this projection takes place as if the points of the photograph P were
projected on the screen P' on straight lines passing through the optical
center S of the lens.

The situation will be a very simple one if the planes P and P' are
parallel. In this case we will obviously obtain on the plane P' an
undistorted image of everything that is on the plane P. This image will be
smaller or larger than the original depending on whether the ratio d':d,
where d and d' are the distances from the center of the lens to the planes
P and P' respectively, is smaller or larger than 1.

The situation will be considerably more difficult if the planes P and P'
are not parallel (figure 79). In this case, under projection through the
point S not only the size of the figure changes but also its form is distorted.
Parallel lines under such projection may become convergent, the ratio
in which a point divides a segment may change, etc. In general, some
of the relations that remain invariant under arbitrary affine transformation
may change here.

This sort of projection takes place, for example, in aerial photography.
The airplane oscillates in flight and therefore the photographic apparatus
(figure 80a) rigidly attached to it is, in general, not oriented altogether
vertically but at the moment of exposure is usually in an oblique position,
i.e., we obtain a distorted image of the locality (which we assume to be
plane).

How are we to correct this image? For this it is necessary to study
the properties of projection of a plane P onto another plane Π (in general,
the two planes are not parallel) by lines passing through a point S which
is not on plane P nor on plane Π. Such projections are called perspective
projections.

We will prove later the following important theorem. 

Theorem. If we have two perspective projections of a plane P on plane Π
such that under both projections the points A, B, C, D of a quadruple of
points of “general position” on the plane P (i.e., a quadruple in which no
three of the points lie on one line), are projected into the same points A',
B', C', D' respectively of plane Π, then all points of plane P are also
projected under both projections into the same points of plane Π.

In other words, the result of a perspective projection is completely
determined if it is known into which points this projection sends the
points of an arbitrary quadruple of points of general position in the
figure to be projected.

This is the so-called uniqueness theorem of the theory of projective 
transformations or the fundamental theorem of plane perspectivity. 
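
In coordinates, this uniqueness can be seen by recovering a projection from four point pairs. The sketch below rests on assumptions not made in the text: it writes the projection in fractional-linear form x' = (h0·x + h1·y + h2)/(h6·x + h7·y + 1), y' = (h3·x + h4·y + h5)/(h6·x + h7·y + 1), which presupposes that the origin is not sent to infinity, and it solves for the eight unknowns by elimination in exact fractions. The names `solve`, `projective_map`, and `reference`, and all numerical values, are illustrative.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Gauss-Jordan elimination in exact fractions. Raises StopIteration if
    the system is singular (e.g. three of the four points lie on one line)."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc for vi, vc in zip(rows[i], rows[col])]
    return [row[n] for row in rows]

def projective_map(src, dst):
    """Fractional-linear map sending the four points src to the four points dst."""
    matrix, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        matrix.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        matrix.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve(matrix, rhs)

    def apply(point):
        x, y = point
        w = h[6] * x + h[7] * y + 1
        return ((h[0] * x + h[1] * y + h[2]) / w,
                (h[3] * x + h[4] * y + h[5]) / w)
    return apply

def reference(point):
    """A sample projection with made-up coefficients, used only to generate data."""
    x, y = point
    w = Fraction(1, 10) * x + Fraction(1, 5) * y + 1
    return ((2 * x + y + 3) / w, (-x + 3 * y + 2) / w)

# Four points of general position and their images under the sample projection.
quadruple = [(0, 0), (1, 0), (1, 1), (0, 1)]
images = [reference(p) for p in quadruple]

recovered = projective_map(quadruple, images)
other_points = [(Fraction(1, 2), Fraction(1, 3)), (3, -2), (Fraction(-5, 4), 7)]
agreement = [recovered(p) == reference(p) for p in other_points]
```

The map recovered from the four pairs alone agrees with the sample projection at the other points tried, which is what the uniqueness theorem leads one to expect.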

Application of the fundamental theorem of plane perspectivity in aerial
photography. Let us show how this theorem provides a suitable method
for correcting the image in aerial photography.

If at the moment of aerial exposure, we imagine a horizontal screen Π
placed at a distance h below the center S of the lens (figure 80a), then
the projection onto this screen through the center S of the image recorded
on the photographic plate P will obviously not be distorted but will be
similar to the horizontal locality with a scale h/H, where H is the height
of the airplane at the moment of exposure. In order to correct the image
received on the photograph P so as to convert it into an undistorted
image, we treat it as follows. The developed photograph P is placed in
a projecting apparatus resting on a special tripod on which, by means
of adjustable screws, the apparatus can be moved closer to the screen Π
or farther from it and can be rotated in every way.

To the screen Π (figure 80b) we attach a topographical map of the
locality made by measurements on the surface of the Earth (not a detailed
map, since the details of interest to us are to be provided by the aerial
photograph). On this map attached to the screen Π we select four points
A', B', C', D' that can be found easily on the photograph also (for example,
an intersection of roads, a corner of a house, etc.), and at the corresponding
points A, B, C, D of the picture P we pierce the film with a needle. We
then place a projection lamp behind the plate P in such a position that
the picture is projected onto the screen Π through a lens S of the
supporting apparatus. By using the adjustable screws we arrange that the
light beams from the pinholes fall on the corresponding points A', B',
C', D' of the map attached to the screen. After this has been done, we
replace the topographical map by a plateholder with a photographic
plate and then, without changing the settings of the screws, we photograph
the image projected on the screen Π of the picture P taken from the
airplane.

By the theorem stated previously, we thereby obtain a true (i.e.,
similar to the locality) and not a distorted map of the photographed
region.

Fig. 80.

We now pass to the presentation of the theory necessary for proving
the fundamental theorem.

The projective plane. The totality of all lines and planes of space
passing through a given point S of the space is called the projecting
bundle of lines and planes with center S. If this bundle is intersected
by a plane P, not passing through the center, then to every point of the
plane P will correspond a line of the bundle intersecting the plane P
in this point, and to each line of the plane P will correspond that plane
of the bundle which intersects the plane P along this line. However, we
do not in this way establish a one-to-one mapping from the set of lines
and planes of the bundle onto the set of points and lines of the plane P.
As a matter of fact, the lines and planes of the bundle which are parallel
to the plane P do not in this sense correspond to any points or lines of
the plane P, since they do not intersect it. Nevertheless, we agree to say
that these lines of the bundle intersect the plane P but in its ideal (or
infinitely distant) points, lying in the corresponding directions, and that
such a plane of the bundle intersects the plane P along an ideal (or infinitely
distant) line. The plane P, complemented by these ideal points and ideal
line, is called a complemented or projective plane. We will denote it by P*.
The sets of lines and planes of the bundle S are then mapped one-to-one
onto the sets of points (real and ideal) and lines (real and ideal) of this
projective plane P*.

Hence, we agree to say that a point (real or ideal) lies on a line (real
or ideal) of the projective plane P* if the corresponding line of the bundle
lies in the corresponding plane of the bundle. From this point of view,
any two lines of the projective plane intersect (in a real or ideal point),
since any two planes of the bundle intersect along some line of the bundle.
It follows from this, among other things, that the ideal line consists
simply of the set of all ideal points.

In essence, the complementation of the plane by its ideal elements
means that we use this plane as a cross section to study the bundle of
all lines and planes passing through one point.
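
The cross-section idea can be tried out in a small computational sketch.
It assumes, purely for illustration, that S is the origin of a coordinate
system (X, Y, W) and that P is the plane W = 1; these coordinates and all
function names are assumptions, not notation from the text. A point of P*
is then a direction (X, Y, W) of a line of the bundle, a line of P* is the
coefficient triple (A, B, C) of a plane AX + BY + CW = 0 of the bundle, and
the common line of two such planes is the cross product of their triples.

```python
from fractions import Fraction


def cross(a, b):
    """Cross product of two integer or rational triples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def meet(line1, line2):
    """Common point of two distinct lines A*x + B*y + C = 0 of the projective plane.

    Each line stands for the plane A*X + B*Y + C*W = 0 of the bundle; two
    such planes always share a line through S, whose direction is returned.
    """
    return cross(line1, line2)


def describe(point):
    """Classify a bundle direction (X, Y, W) as a real or an ideal point."""
    x, y, w = point
    if w == 0:
        # The bundle line is parallel to the plane W = 1: an ideal point,
        # reported by its direction in the plane.
        return ("ideal", (x, y))
    return ("real", (Fraction(x, w), Fraction(y, w)))


# x + y - 2 = 0 and x - y = 0 meet in the real point (1, 1).
crossing = describe(meet((1, 1, -2), (1, -1, 0)))

# x - y = 0 and x - y + 3 = 0 are parallel; they meet in an ideal point.
parallel = describe(meet((1, -1, 0), (1, -1, 3)))
```

In this sketch parallel lines are not a special case: their common point
simply has W = 0, which is the computational counterpart of the agreement
made above about ideal points.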

Projective mappings; the fundamental theorem. By a projective mapping
we understand a mapping of a projective plane P* onto some
other projective plane P*' (which can also coincide with the plane P*,
in which case we speak of a projective transformation of the plane P*)
which, first, is a one-to-one mapping of points and, second, is
such that collinear sets of points of the plane P* go into collinear sets
of points of the plane P*', and conversely. (Here, by points and lines
we always understand real as well as ideal points and lines.)

It is clear that two arbitrary perspective projections of one and the
same plane P* onto a plane Π* may be obtained from each other by
projective transformations.

In fact, (1) their points (real or ideal) are in one-to-one correspondence
with the points (real or ideal) of the projective plane P* and consequently
with each other, and (2) collinear points of the first projection correspond
to collinear points of the plane P* and consequently also of the second
projection, and conversely. Therefore, the aforementioned theorem of
the theory of perspectivities is a direct consequence of the following
theorem about projective transformations: if under a projective
transformation of the plane Π* four of its points A, B, C, D, forming a
quadruple in general position, remain fixed, then all of its points remain
fixed.

Let us outline the idea of the proof of this theorem by means of the
so-called Möbius net.

We note that (1) if under a projective transformation two points
remain fixed, then the line that passes through them is mapped into
itself, and (2) if two lines are mapped into themselves, then the point
of their intersection remains fixed. Therefore, from the fact that the
points A, B, C, D of the plane Π* remain fixed, it follows in turn that
the points E, F, G, H, K, L, etc. also remain fixed (figure 81). The
construction of such points can be continued by joining the points already
obtained. This is the so-called Möbius net. By continuing its construction,
we can find points as densely placed as we like. It can be shown that the
set of these nodes is everywhere dense in the whole plane. Therefore,
if we further assume the continuity of a projective transformation (which
in fact already follows from its definition, although the proof of this
fact is not easy), the result is that if under a projective transformation
of the plane Π* the points A, B, C, D remain fixed, then all the points
of the plane Π* remain fixed.

Projective geometry. By two-dimensional projective geometry we mean
the totality of theorems about those properties of figures in the projective
plane, i.e., the ordinary plane complemented by ideal elements, which
do not change under arbitrary projective transformations.

Here is an example of a problem of projective geometry. Given two
lines a and b and a point M (figure 82), the problem is to construct the
line c passing through the point M and through the point of intersection
of lines a and b, not using this point of intersection (as may be necessary
if this point is very distant). If through the point M we draw the two
secants 1 and 2 and then the lines 3 and 4 through the points of their
intersection with lines a and b, we obtain the point K. Let us draw through
it line 5 and secants 6 and 7; then it can be shown that the line c passing
through the point L of intersection of lines 6 and 7 and the point M is the
desired line.

From the theory of conic sections, it follows (figure 83) that the ellipse,
hyperbola and parabola are perspective projections of one another, and
moreover all of them are perspective projections of the circle.

If we regard perspective projections as projective transformations of
projective planes P* and P*' one onto the other, then by superposing
these planes we obtain the result that all ellipses, hyperbolas, and
parabolas are projective transformations of the circle. The difference
among them is that projective images of the circle under transformations
in which a line not intersecting the circle is mapped into the infinitely
distant line are ellipses; on the other hand, if a line tangent to the
circle is mapped into the infinitely distant line, then a parabola is
obtained, and if a secant, then a hyperbola (figure 84).
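
The three cases can be checked on one concrete family of transformations.
The worked example below is only a sketch: the unit circle and the
particular fractional-linear map are chosen here for illustration and do
not come from the text, and it is taken for granted that such a map (a
ratio of linear expressions with nonzero determinant) is a projective
transformation. The map sends the line y = h to the infinitely distant
line, so h > 1 gives a line missing the circle, h = 1 a tangent, and
h = 0 a secant.

```latex
% Circle: x^2 + y^2 = 1.
% Illustrative transformation: x' = x/(y - h),  y' = 1/(y - h).
\begin{align*}
x = \frac{x'}{y'}, \qquad y &= h + \frac{1}{y'}
  && \text{(inverse transformation)} \\
\frac{x'^2}{y'^2} + \Bigl(h + \frac{1}{y'}\Bigr)^2 &= 1
  && \text{(substitute into the circle)} \\
x'^2 + (h^2 - 1)\,y'^2 + 2h\,y' + 1 &= 0
  && \text{(multiply by } y'^2\text{)} \\[4pt]
h = 2:\quad x'^2 + 3\bigl(y' + \tfrac{2}{3}\bigr)^2 &= \tfrac{1}{3}
  && \text{(line misses the circle: ellipse)} \\
h = 1:\quad y' &= -\tfrac{1}{2}\,\bigl(x'^2 + 1\bigr)
  && \text{(tangent line: parabola)} \\
h = 0:\quad y'^2 - x'^2 &= 1
  && \text{(secant line: hyperbola)}
\end{align*}
```

The sign of the coefficient h^2 - 1 of y'^2 decides the type of the image
curve, which matches the geometric criterion stated above.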

The notation of projective transformations in formulas. If on the
plane P* we take an ordinary Cartesian coordinate system, then, as can
be shown, the formulas for projective transformations of the plane are
as follows:

$$
x' = \frac{a_1 x + b_1 y + c_1}{a_3 x + b_3 y + c_3}, \qquad
y' = \frac{a_2 x + b_2 y + c_2}{a_3 x + b_3 y + c_3},
$$

where the determinant

$$
\begin{vmatrix}
a_1 & b_1 & c_1 \\
a_2 & b_2 & c_2 \\
a_3 & b_3 & c_3
\end{vmatrix} \neq 0,
$$

and conversely, every pair of formulas of this form with nonzero
determinant defines a projective transformation.

If for some point (x, y) the denominators are equal to zero, this means
that its image (x', y') is an ideal (infinitely distant) point. The equation

$$
a_3 x + b_3 y + c_3 = 0
$$

represents the line which under the given projective transformation goes
into the ideal (infinitely distant) line.
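
As a hedged illustration of these formulas and of the theorem on four
fixed points, the sketch below stores the nine coefficients as the rows
(a1, b1, c1), (a2, b2, c2), (a3, b3, c3) of a matrix and finds a
transformation carrying four given points into four others, which is what
the adjustment of the screws achieves in the map-making example. The
function names and the construction through an auxiliary "basis matrix"
are assumptions of this sketch, not the method of the text; the input
points are assumed to be real and in general position, with integer or
rational coordinates.

```python
from fractions import Fraction


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def adjugate(m):
    """Adjugate of a 3x3 matrix (the inverse up to the factor det)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return [[e * i - f * h, c * h - b * i, b * f - c * e],
            [f * g - d * i, a * i - c * g, c * d - a * f],
            [d * h - e * g, b * g - a * h, a * e - b * d]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def basis_matrix(points):
    """Matrix sending (1,0,0), (0,1,0), (0,0,1), (1,1,1) to the four points."""
    cols = [(x, y, 1) for x, y in points[:3]]
    m = [[cols[j][i] for j in range(3)] for i in range(3)]
    if det3(m) == 0:
        raise ValueError("the first three points are collinear")
    x4, y4 = points[3]
    # Column weights, determined only up to the common factor det3(m).
    lam = matvec(adjugate(m), [x4, y4, 1])
    if 0 in lam:
        raise ValueError("the four points are not in general position")
    return [[m[i][j] * lam[j] for j in range(3)] for i in range(3)]


def projective_map(src, dst):
    """Coefficient rows of a projective map with src[k] -> dst[k], k = 0..3."""
    return matmul(basis_matrix(dst), adjugate(basis_matrix(src)))


def apply(h, x, y):
    """Apply the fractional-linear formulas; None stands for an ideal image."""
    num_x, num_y, den = matvec(h, [x, y, 1])
    if den == 0:
        return None
    return (Fraction(num_x, den), Fraction(num_y, den))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
quad = [(0, 0), (2, 0), (3, 2), (0, 1)]
h = projective_map(square, quad)

# Four fixed points in general position: only the identity is possible.
h_fixed = projective_map(square, square)
```

With the four points kept fixed, the computed matrix is a multiple of the
unit matrix, so `apply(h_fixed, x, y)` returns (x, y) for every point, in
agreement with the uniqueness theorem. Since the coefficients matter only
up to a common factor, the sketch uses adjugates instead of inverses and
avoids division until the final step.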


§14. Lorentz Transformations 

The derivation of the formulas of the Lorentz transformation for motion
on a straight line and in the plane from the condition of the constancy of
the speed of light. At the very end of the 19th century a fundamental
contradiction was discovered in physics. Michelson’s well-known
experiment, in which the speed of light (which is about 300,000 km/sec)
was measured in the direction of motion of the Earth along its orbit
around the Sun (the speed of the Earth is about 30 km/sec) and
perpendicular to this direction, showed irrefutably that all moving bodies
in nature, even if they are moving in a vacuum, are contracted in the
direction of motion. The theory of this contraction was investigated in
detail by the Dutch physicist Lorentz. He showed that this contraction
is greater as the speed of the moving body gets closer to the speed of
light in a vacuum, and at a speed equal to the speed of light the
contraction becomes infinite. Lorentz derived the formulas for this
contraction. But shortly afterwards the physicist Einstein introduced
into this problem a completely different point of view, to which Poincaré
was already close. Einstein argued as follows. If we assume that for the
propagation of light, as for ordinary motion of a material body, Galileo’s
law of composition of velocities is valid, then the speed of light is
c' = c + v, where v is the speed of the observer moving toward the
propagating light and c is the speed of light for a stationary observer.
From Michelson’s experiment it follows that c' = c. The law c' = c + v is
based on the transformation

$$
x' = x + v_x t, \qquad t' = t, \tag{25}
$$

connecting the coordinate x of a point relative to a coordinate system I
with its coordinate x' relative to a coordinate system II which has its
axes parallel to the axes of system I and which moves parallel to the
Ox-axis with velocity $v_x$ relative to system I. Clearly, these are the
formulas, as Einstein says, that must be changed.

It can be shown, as was recently done, for example, by A. D.
Aleksandrov, that from the equality of the speed of light in both
coordinate systems x, y, z, t and x', y', z', t' it already follows that
the formulas of transformation from coordinates x, y, z, t to coordinates
x', y', z', t' are linear and homogeneous, i.e., have the form

$$
\begin{aligned}
x' &= a_1 x + b_1 y + c_1 z + d_1 t, \\
y' &= a_2 x + b_2 y + c_2 z + d_2 t, \\
z' &= a_3 x + b_3 y + c_3 z + d_3 t, \\
t' &= a_4 x + b_4 y + c_4 z + d_4 t.
\end{aligned} \tag{26}
$$

From other considerations one can show that their determinant* is
equal to unity.

If a point in system I moves rectilinearly and uniformly in an arbitrary
given direction with the speed of light c, then $x = v_x t$, $y = v_y t$,
$z = v_z t$ and $v_x^2 + v_y^2 + v_z^2 = c^2$, from which

$$
x^2 + y^2 + z^2 - c^2 t^2 = 0. \tag{27}
$$

But according to Michelson’s experiment this point in system II also
necessarily moves with the same speed of light c, so that it is also
necessary that

$$
x'^2 + y'^2 + z'^2 - c^2 t'^2 = 0.
$$

Consequently the formulas (26) are not just arbitrary transformations
which are linear, homogeneous, and with determinant equal to 1, but
must at the same time satisfy the condition that if the coordinates
x, y, z, t are such that

$$
x^2 + y^2 + z^2 - c^2 t^2 = 0,
$$

then the transformed coordinates x', y', z', t' must also satisfy this
equation. Such transformations (26) are called Lorentz transformations.


* See Chapter XVI.

Let us first consider the simplest case, when the point moves along
the Ox-axis. In this case formulas (26) have the form

$$
x' = a_1 x + d_1 t, \qquad t' = a_4 x + d_4 t, \tag{26'}
$$

and equation (27) becomes

$$
x^2 - c^2 t^2 = 0. \tag{27'}
$$

Let us introduce the notation ct = u; then formulas (26') and equation
(27') take the form

$$
x' = a_1 x + \frac{d_1}{c}\, u, \qquad u' = a_4 c\, x + d_4 u \tag{$26_1$}
$$

and

$$
x^2 - u^2 = 0.
$$

Let us find the explicit form of formulas (26₁). Consider x and u as
Cartesian rectangular coordinates in the plane, i.e., consider the problem
geometrically; then we may regard formulas (26₁) as those of an affine
transformation of the plane Oxu (whose determinant, as was shown, is equal
to 1). We will denote this transformation by L. If, as we assume,
$x^2 - u^2 = 0$ implies $x'^2 - u'^2 = 0$, then this transformation
carries the intersecting straight lines

$$
x^2 - u^2 = 0
$$

into themselves. The transformation L is therefore a combination of a
contraction and an expansion with identical coefficients τ along these
lines.

Fig. 85.

From figure 85 we obtain

$$
x = \frac{p}{\sqrt{2}} - \frac{q}{\sqrt{2}}, \qquad
u = \frac{p}{\sqrt{2}} + \frac{q}{\sqrt{2}},
$$

where p and q denote the coordinates of a point along the lines x = u and
x = -u. But after the transformation L, the numbers p and q will go into
$p' = p/\tau$ and $q' = q\tau$, so that

$$
x'\sqrt{2} = \frac{p}{\tau} - q\tau, \qquad
u'\sqrt{2} = \frac{p}{\tau} + q\tau.
$$

Expressing p and q in terms of x and u from the first pair of equations,
$p = (x + u)/\sqrt{2}$ and $q = (u - x)/\sqrt{2}$, substituting into the
second pair, returning to $u = ct$, and simplifying, we obtain

$$
x' = \frac{x - \dfrac{\tau^2 - 1}{\tau^2 + 1}\, ct}{\dfrac{2\tau}{\tau^2 + 1}}, \qquad
t' = \frac{t - \dfrac{1}{c}\,\dfrac{\tau^2 - 1}{\tau^2 + 1}\, x}{\dfrac{2\tau}{\tau^2 + 1}},
$$

or, setting $\dfrac{\tau^2 - 1}{\tau^2 + 1}\, c = v$, we have

$$
x' = \frac{x - vt}{\sqrt{1 - (v/c)^2}}, \qquad
t' = \frac{t - vx/c^2}{\sqrt{1 - (v/c)^2}},
$$

which are the famous Lorentz formulas.
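
The passage from the contraction and expansion along the two lines to the
Lorentz formulas can be checked numerically. The following is a minimal
sketch under stated assumptions: the common coefficient of contraction and
expansion is called `tau` in the code, the value of c is the rounded figure
quoted in the text, and the function names are invented for this example.

```python
import math

C = 300_000.0  # km/sec, the rounded speed of light used in the text


def squeeze(x, u, tau):
    """Contraction and expansion with coefficient tau along x = u and x = -u."""
    p = (x + u) / math.sqrt(2)   # coordinate along the line x = u
    q = (u - x) / math.sqrt(2)   # coordinate along the line x = -u
    p, q = p / tau, q * tau
    return (p - q) / math.sqrt(2), (p + q) / math.sqrt(2)


def velocity_from_tau(tau, c=C):
    """The substitution v = (tau^2 - 1) / (tau^2 + 1) * c."""
    return (tau ** 2 - 1) / (tau ** 2 + 1) * c


def lorentz(x, t, v, c=C):
    """The Lorentz formulas for motion along the Ox-axis."""
    root = math.sqrt(1 - (v / c) ** 2)
    return (x - v * t) / root, (t - v * x / c ** 2) / root


tau = 3.0
v = velocity_from_tau(tau)               # 0.8 of the speed of light
x1, u1 = squeeze(1.0, C * 2.0, tau)      # event x = 1 km, t = 2 sec, u = ct
x2, t2 = lorentz(1.0, 2.0, v)
interval_before = 1.0 ** 2 - (C * 2.0) ** 2
interval_after = x2 ** 2 - (C * t2) ** 2
```

Both routes give the same x' and t' (up to rounding), and the quantity
x^2 - c^2 t^2 is unchanged, not only when it equals zero. Any other
tau > 0 can be substituted for the value 3 chosen here.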

In particular, if we take x = 0, i.e., if we consider the motion of the
origin of coordinate system I, we obtain

$$
x' = \frac{-vt}{\sqrt{1 - (v/c)^2}}, \qquad
t' = \frac{t}{\sqrt{1 - (v/c)^2}},
$$

or $x' = -vt'$, from which obviously v is the speed of motion of coordinate
system II relative to system I.

Suppose, for example, that we are given two points on the Ox-axis
with coordinates $x_1$ and $x_2$ relative to system I, so that the distance
between them relative to system I is $r = |x_1 - x_2|$. Let us see what the
distance between them is for an observer attached to system II. We have

$$
x_1' = \frac{x_1 - vt}{\sqrt{1 - (v/c)^2}}, \qquad
x_2' = \frac{x_2 - vt}{\sqrt{1 - (v/c)^2}},
$$

from which

$$
r' = |x_1' - x_2'| = \frac{|x_1 - x_2|}{\sqrt{1 - (v/c)^2}}.
$$

The factor $\sqrt{1 - (v/c)^2}$ is exactly the coefficient of the Lorentz
contraction. Since c is very large, this coefficient is very close to 1 for
moderately large v, and therefore the contraction is not significant. But
such elementary particles as electrons or positrons often move with
velocities comparable to the speed of light, and therefore in studying
their motion it is necessary to take this contraction into account, or,
as they say, to consider the relativistic effect.
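
To see how close to 1 the coefficient is in practice, it can simply be
evaluated. The sketch below uses the rounded figures quoted earlier in this
section (300,000 km/sec for light, 30 km/sec for the Earth); the particle
speed of nine tenths of the speed of light is an assumed example value, and
the function name is invented here.

```python
import math

C = 300_000.0  # km/sec, rounded speed of light from the text


def contraction_coefficient(v, c=C):
    """The factor sqrt(1 - (v/c)^2) for a speed v below c."""
    if abs(v) >= c:
        raise ValueError("the speed must be smaller than the speed of light")
    return math.sqrt(1.0 - (v / c) ** 2)


earth = contraction_coefficient(30.0)          # orbital speed of the Earth
particle = contraction_coefficient(0.9 * C)    # assumed fast particle
```

For the Earth the coefficient differs from 1 by about five parts in a
billion, while for the assumed particle it is roughly 0.44, which is why
the effect can be ignored in the first case and not in the second.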

We pass now to the case next in complexity, namely, when the point
moves in the Oxy-plane. For this case the transformations (26) will have
the form

$$
\begin{aligned}
x' &= a_1 x + b_1 y + d_1 t, \\
y' &= a_2 x + b_2 y + d_2 t, \\
t' &= a_3 x + b_3 y + d_3 t,
\end{aligned} \tag{26''}
$$

where

$$
\begin{vmatrix}
a_1 & b_1 & d_1 \\
a_2 & b_2 & d_2 \\
a_3 & b_3 & d_3
\end{vmatrix} = 1,
$$

and equation (27) will be

$$
x^2 + y^2 - c^2 t^2 = 0. \tag{27''}
$$

These are the Lorentz formulas for motion in the Oxy-plane.

Again we put ct = u. Then transformations (26'') can be rewritten as

$$
\begin{aligned}
x' &= a_1 x + b_1 y + \frac{d_1}{c}\, u, \\
y' &= a_2 x + b_2 y + \frac{d_2}{c}\, u, \\
u' &= a_3 c\, x + b_3 c\, y + d_3 u,
\end{aligned} \tag{$26_2$}
$$

where the determinant will again be equal to one, and equation (27'')
will assume the simpler form

$$
x^2 + y^2 - u^2 = 0. \tag{$27_2$}
$$

We will regard x, y, u as the Cartesian rectangular coordinates of a
point in ordinary three-dimensional space and will consider formulas (26₂)
as those of affine transformations of this space. Equation (27₂) represents
a right circular cone K with an angle of 90° at the vertex (figure 86).

From the point of view of this geometric interpretation of a Lorentz
transformation (we call it geometric because here we regard u = ct simply
as a space coordinate), the set of Lorentz transformations for motion in
the plane is identical with the set of all equi-affine (i.e., affine and
volume-preserving) transformations of the space which map the cone K
onto itself.

Let us consider some special Lorentz transformations.

1. It is clear that any simple rigid rotation about the axis of the cone K
through an angle ω is an equi-affine transformation of space, mapping
the cone K into itself, i.e., it is a special Lorentz transformation. We will
denote it by ω.

2. Reflections of the space in an arbitrary plane π passing through
the axis of the cone K are clearly also Lorentz transformations. We will
denote them by π.

3. Finally, let us consider the following transformation (figure 87).
Let v and w be any pair of opposite generators of the cone, and let P
and Q be the planes tangent to the cone along these generators. These
planes are mutually perpendicular. Let us make a contraction of the
space to the plane P and an expansion of it with the same coefficient
from the plane Q, or conversely. For example, we contract the space
by a factor of three to the plane P and expand it also by a factor of three
from the plane Q. Such a transformation of space is clearly also affine
and preserves all volumes. We will denote it by L. We show that this
transformation maps the cone K into itself. Because the cone K has the
axis u as its axis of revolution, any figure can be rotated in such a way
that the generators v and w lie, for example, in the plane Sxu. Therefore
it is sufficient to carry out the proof for this case.

Fig. 87.

For the proof we intersect the cone K by an arbitrary plane R parallel
to the Sxu plane. The equation of this plane is y = b, where b is a
constant. Substituting this value in the equation of the cone K, we obtain

$$
x^2 - u^2 = -b^2.
$$

This is the equation of a hyperbola for which the lines of intersection
of the plane R with the planes P and Q are exactly the asymptotes. But
since it is characteristic of a point of such a hyperbola that the product
of its distances p and q to the asymptotes, i.e., to the planes P and Q, is
constant, under the transformation L all points of this hyperbola remain on
the same hyperbola, and the hyperbola is mapped onto itself. But the
whole surface of the cone K consists of such hyperbolas, and therefore
under the transformation L of the space the cone K is sent into itself.
This transformation L is therefore also a Lorentz transformation.
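
The argument with hyperbolic sections can be verified with exact rational
arithmetic. The sketch below assumes, as in the proof, that the two
generators lie in the plane y = 0, so that the tangent planes are x = u and
x = -u; the function names and the sample points are illustrative choices.

```python
from fractions import Fraction


def cone_form(x, y, u):
    """Left-hand side of the cone equation x^2 + y^2 - u^2 = 0."""
    return x * x + y * y - u * u


def squeeze_3d(x, y, u, k):
    """Contract toward the plane x = u by the factor k and expand from
    the plane x = -u by the same factor; y is left unchanged."""
    k = Fraction(k)
    s = Fraction(x) - u      # proportional to the distance from the plane x = u
    r = Fraction(x) + u      # proportional to the distance from the plane x = -u
    s, r = s / k, r * k      # the product s * r does not change
    return ((r + s) / 2, Fraction(y), (r - s) / 2)


# Points of the cone with integer coordinates.
on_cone = [(3, 4, 5), (5, 12, 13), (-8, 15, 17), (0, 1, 1)]
images = [squeeze_3d(x, y, u, 3) for x, y, u in on_cone]

# A point off the cone: the value of the quadratic form is preserved too.
before = cone_form(2, 7, 1)
after = cone_form(*squeeze_3d(2, 7, 1, 3))
```

Since x^2 - u^2 equals the product s * r and y is untouched, every section
y = b goes into itself, exactly as in the proof; the factor 3 is the one
used as an example in the text.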

Since under affine transformations straight lines go into straight lines,
and intersecting lines go into intersecting lines, the bundle S of
straight lines (with center at the vertex of the cone K) is mapped
one-to-one onto itself under any Lorentz transformation. Moreover, under
affine transformations of space all planes go into planes, so that under
these transformations of the bundle S onto itself a projective
transformation of the bundle is obtained. The bundle as a whole is not
altered by the given Lorentz transformation of space. If we intersect
this bundle by a plane Π perpendicular to the axis of the cone K, extend
this plane to the projective plane Π*, and then trace the points of
intersection of the lines of the bundle S with the plane Π*, we have the
result that the Lorentz transformations of the bundle will simultaneously
produce projective transformations Λ of the plane Π*, and these latter
will transform the circle a, in which the plane Π* intersects the
interior part of the cone K, into itself. To analyze the properties of
Lorentz transformations, it is easier to examine these projective
transformations Λ of the circle a into itself.

Projective transformations of a circle into itself. A point, a ray or
half line issuing from it, and one of the half planes cut off by the entire
line will be called a “frame” of the plane Π* (not to be confused with
a coordinate frame, §11). We show (figure 88) that if we take two arbitrary
frames M and M' containing interior points of the circle a, then by
means of the transformations L, ω, π introduced above (contraction and
expansion, rotation, reflection) we can send one of these frames
into the other. For this it is sufficient to make the transformations
$\Lambda = L_1 \cdot \omega \cdot L_2^{-1}$ (or else
$\Lambda = L_1 \cdot \omega \cdot \pi \cdot L_2^{-1}$). The transformation
$L_1$ sends the first frame M to the center O of the circle a, the
transformation ω rotates it as necessary, and finally the transformation
$L_2^{-1}$ brings it into coincidence with the second frame M'.

Let us show, in addition, that there is only one transformation Λ which
carries a given frame M into a given frame M'. In order to do this
we observe first that if there were two different transformations
$\Lambda_1$ and $\Lambda_2$ sending frame M into frame M', then the
transformation $\Lambda = \Lambda_1 \Lambda_2^{-1}$ would not be the
identity transformation and would send frame M into itself.
Therefore, it is sufficient to show that if a transformation Λ sends
frame M into itself, then it is the identity, i.e., leaves all points of
the plane of the circle a fixed.

Fig. 88.

Fig. 89.

Let us show this. Suppose that the transformation Λ sends frame M
into itself (figure 89). Then it maps the line AB of this frame into itself,
but since it sends the circumference of the circle a into itself, it therefore
leaves points A and B fixed, or else interchanges them. The latter, however,
is impossible, since the half line of the frame is mapped into itself. Let us
draw the tangents at points A and B to the circle a. They are mapped
into themselves, since if such a tangent were mapped into a secant AĀ,
then the inverse transformation would send the different points A and Ā
of the circle a into the one point A. But the transformations Λ are
projective and consequently one-to-one. Since under the transformation Λ
these tangents go into themselves, the point N of their intersection
remains fixed, and consequently the line MN is mapped into
itself. From the fact that the half plane of the frame M is mapped into
itself, we conclude as above that the points C and D, in which the line MN
meets the circle a, are not interchanged, but remain fixed. Hence, under
the given projective transformation Λ of the projective plane Π* four of
its points A, B, C, D, no three of which lie on the same line, remain
fixed. According to the uniqueness theorem for projective
transformations, this is the identity transformation.

Later, in §5 of Chapter XVII it will be shown that by using these
properties of the Lorentz group, it is easy to construct a model of
Lobačevskiĭ’s plane geometry, and if we consider Lorentz transformations
for the general case of motion of a point in space, then we can do the
same for Lobačevskiĭ’s space geometry, and thereby prove its consistency.

We see that the theory of Lorentz transformations, projective geometry,
the theory of perspectivity, and non-Euclidean geometry are closely
related to one another. It turns out that there is still another theory that
is also closely related to them, namely, the so-called conformal
transformations in the theory of functions of a complex variable, which solve
such important problems of mathematical physics as the distribution of
temperature in a heated plate, the flow of air around the wing of an
airplane, the distribution of charge in a plane electrostatic field, the
problems of elasticity in the plane, and many others.


Conclusion 

Analytic geometry is an absolutely indispensable method for the
investigation of other branches of mathematics, physics, and other
natural sciences. Therefore it is studied not only at universities but in
all technical higher institutions of learning, and also in some vocational
schools. It is also a question whether we should not include a fairly
detailed treatment of the elements of analytic geometry in high school
courses.


Various coordinates. The essential elements of the concept of analytic
geometry, as we have seen, are the coordinate method and the investigation
of equations connecting these coordinates. Besides Cartesian coordinates,
other coordinates can be considered. For example, in the plane, we
can choose a point P (the so-called pole) and a ray originating from it
(the polar axis) and determine the position of a point M by the length ρ
of the polar radius from the pole to the point and the value ω of the
angle made by this radius with the polar axis (figure 90).

Fig. 90.

Fig. 91.

In particular, the ellipse, hyperbola, or parabola, if for the pole we take
a focus, and for the polar axis the ray passing from the focus along the
axis of symmetry to the side opposite the nearer vertex (figure 91), have
one and the same equation

$$
\rho = \frac{p}{1 - \varepsilon \cos \omega},
$$

where ε is the eccentricity of the curve, and p is its so-called parameter.
This equation is of great importance in astronomy, for it was with
its help that the result was derived, from the law of inertia and the law
of universal gravitation, that the planets revolve about the Sun in ellipses.
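
The polar equation is easy to experiment with. The sketch below, with
invented function names and assumed sample values for the parameter and
the eccentricity, evaluates the polar radius and checks for an ellipse the
familiar property that the sum of the distances from any of its points to
the two foci is constant. It relies on the standard relation
a = p / (1 - eps^2) for the semimajor axis, which is not stated in the text.

```python
import math


def polar_radius(omega, p, eps):
    """Polar radius p / (1 - eps*cos(omega)); None where the formula
    gives no positive radius (possible only when eps >= 1)."""
    denom = 1.0 - eps * math.cos(omega)
    if denom <= 0.0:
        return None
    return p / denom


def focal_distance_sum(omega, p, eps):
    """For an ellipse (0 <= eps < 1): distance to the pole (first focus)
    plus distance to the second focus, for the point with polar angle omega."""
    a = p / (1.0 - eps ** 2)          # semimajor axis (standard relation)
    rho = polar_radius(omega, p, eps)
    x, y = rho * math.cos(omega), rho * math.sin(omega)
    # The second focus lies on the polar axis at distance 2*a*eps from the pole.
    return rho + math.hypot(x - 2.0 * a * eps, y)


P, EPS = 3.0, 0.5                      # assumed sample values; then a = 4
angles = [k * math.pi / 6 for k in range(12)]
sums = [focal_distance_sum(w, P, EPS) for w in angles]
```

Every entry of `sums` comes out as 2a = 8 up to rounding. For the parabola
(eps = 1) the call `polar_radius(0.0, P, 1.0)` returns None, reflecting the
fact that the curve has no point in the direction of the polar axis.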

The geographical coordinates, latitude and longitude, by which the
position of a point is given on a sphere, are well known.

Analogously, we can take a coordinate network on an arbitrary surface, 
as is done in differential geometry (see Chapter VII), etc. 

Many-dimensional and infinite-dimensional analytic geometry; algebraic
geometry. It would seem that in the 19th century analytic geometry
underwent such an immense development, described earlier in a general
way, and produced so many ideas, that it would necessarily have exhausted
itself, but this is not so. In very recent times, two new, extensive
branches of mathematics have been rapidly developed and have extended the
concepts of analytic geometry, namely so-called functional analysis and
general algebraic geometry. It is true that both of these only halfway
represent a straightforward continuation of classical analytic geometry:
much of functional analysis is analysis, and in algebraic geometry there
is more than a little of the theory of functions and of topology.

Let us explain what we mean. In the middle of the last century
mathematicians had already begun to consider four-dimensional and
general n-dimensional analytic geometry, i.e., to study those questions
of algebra that are straightforward generalizations of algebraic questions
of the kind involved in two- and three-dimensional analytic geometry,
to the case when there are four or n unknowns. At the very end of the
19th century a series of outstanding analysts came to the idea that for
the purposes of analysis and mathematical physics it is important to
consider infinite-dimensional analytic geometry.

If n-dimensional or even four-dimensional spaces seem at first glance
like farfetched mathematical fictions, then the same can also
be said about an infinite-dimensional space. But it is not really so. The
arguments concerning an infinite-dimensional space are not at all difficult.
They now constitute an important branch of mathematics, functional
analysis (see Chapter XIX).

It is curious that infinite-dimensional analytic geometry has most
important practical applications and plays a fundamental role in
contemporary physics.

As to algebraic geometry, it is a more immediate continuation of
ordinary analytic geometry, which is itself only a part of algebraic
geometry. Algebraic geometry can be regarded as that part of mathematics
which is occupied with curves, surfaces, and hypersurfaces, represented
in Cartesian coordinates by algebraic equations of not only first and second
degree, but also of higher degrees. It turns out that in these
investigations it is advantageous to consider not only real but also
complex coordinates, i.e., to consider everything in a so-called complex
space. The most important results in this domain were obtained in the
last century by Riemann. As a brilliant example of theorems about
higher-order curves, we point out a remarkably general result of
I. G. Petrovskiĭ about the number of ovals into which an nth-order curve
can be decomposed. Petrovskiĭ showed that
if p is the number of such ovals which do not lie at all in other ovals,
or lie in an even number of ovals, and m is the number of those ovals
which lie in an odd number of ovals, and if we consider only curves
whose component ovals neither intersect themselves nor each other
(figure 92), then

$$
p - m \leq \frac{3n^2 - 6n}{8} + 1,
$$

where n is the order of the curve, i.e., the degree of the equation by which
the curve is represented.

This result is the more important as up to then almost nothing had
been known about the general form of a higher-order curve. It is no
doubt one of the most important recent general theorems in analytic
geometry.
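
To get a feeling for the size of the bound in Petrovskiĭ’s inequality, one
can tabulate the right-hand side (3n^2 - 6n)/8 + 1 for small orders. The
short sketch below does only this arithmetic. Its restriction to even n is
an assumption of the sketch (for even n the expression is an integer); the
text itself states no parity condition, and the function name is invented.

```python
def petrovskii_bound(n):
    """Right-hand side (3n^2 - 6n)/8 + 1 of the inequality, for even n > 0."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this sketch handles only positive even orders")
    # For even n the product 3n(n - 2) is divisible by 8.
    return (3 * n * n - 6 * n) // 8 + 1


bounds = {n: petrovskii_bound(n) for n in (2, 4, 6)}
```

For n = 2 the bound equals 1, which is attained by an ellipse: a single
oval lying in no other oval, so that p = 1 and m = 0.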

CHAPTER IV

ALGEBRA: THEORY OF ALGEBRAIC EQUATIONS


§1. Introduction 

The characteristic features of algebra are well known to everyone, since
the elementary but fundamental information about it is already given in
high school. Algebra is characterized, first of all, by its method, involving
the use of letters, and expressions in letters, on which we perform
operations according to definite laws. In elementary algebra the letters
denote ordinary numbers, so that the laws of operations on expressions in
letters are based on the general laws of operations on numbers. For example,
the sum does not depend on the order of the summands, a fact which in
algebra is written as a + b = b + a; in multiplying the sum of two
numbers by a third, we can multiply each one of the numbers individually
and then add the products so obtained: (a + b)c = ac + bc, etc.

If we trace the proof of an algebraic theorem, it is easy to see that it
depends only on these laws for operations on numbers and not at all on
what the letters represent.

The algebraic method, i.e., the method of calculations with letters,
penetrates all of mathematics. In fact, a substantial part of the solution
of a mathematical problem often turns out to be nothing but a more or less
complicated algebraic computation. Besides, in mathematics we employ various
symbolic calculations in which the letters no longer denote numbers but
some other entities, where the laws for operations on these entities may be
different from the laws of elementary algebra. For example, in geometry,
mechanics, and physics we make use of vectors, and as is well known, the
laws for operations on vectors are in part the same as for numbers and in
part essentially different.

The significance of the algebraic method in modern mathematics and
the range of its applications have greatly increased in recent decades.

First of all, the growing demands of technology force us to reduce to
numerical results the solutions of difficult problems of mathematical
analysis, and this usually proves to be feasible only after the
algebraization of these problems, a process which in turn creates new and
sometimes difficult problems in algebra itself.

Second, certain problems of analysis became clear and understandable
only after they were attacked by algebraic methods based on a profound
generalization (to the case of infinitely many unknowns) of the theory
of systems of equations of the first degree.

Finally, the more advanced parts of algebra have found application in
contemporary physics: in fact, the fundamental concepts of quantum
mechanics are expressed in terms of complicated and nonelementary
algebraic entities.

The basic features of the history of algebra are as follows.

First of all we must point out that our ideas regarding what algebra
is and what its fundamental problem consists of have changed twice:
once in the first half of the past century, and the second time at the
beginning of our century. Thus, algebra has meant at different times three
quite different things. In this respect the history of algebra differs from
the history of the three famous branches of mathematics: analytic
geometry, differential calculus, and integral calculus, which were forged
into shape at the hands of their creators, Fermat, Descartes, Newton,
Leibnitz, and others and were later rapidly developed and amplified,
sometimes by the addition of great new sections, but were comparatively
little changed in their fundamental character.

In ancient times any law that was discovered for the solution of a class
of mathematical problems was recorded simply in words, since symbolic
calculations had not yet been invented. The word “algebra” itself was
created from the name of the important work of the Kharizmian scientist
of the 9th century, Mohammed Al-Kharizmi (see Chapter I), in whose
works the first general law for the solution of first- and second-degree
equations was deduced. However, the introduction of the symbolic
notation itself is usually associated with the name of Viète, who
began to denote by letters not only the unknowns but also the given
quantities. Descartes also did a great deal for the development of symbolic
notation, and he too, of course, took the letters to mean ordinary numbers.
It is at this moment that algebra really begins as the science of symbolic
calculations, of transformations of formulas composed of letters, of
algebraic equations, and so forth, in contrast to arithmetic, which always
operates on concrete numbers. Only now did complicated mathematical concepts
become perspicuous and accessible to investigation, since by taking a
look at a formula in letters, it is in most cases possible for us to see its
general arrangement or law of formation and to subject it to suitable
transformations. At that time everything in mathematics which was neither
geometry nor infinitesimal analysis was called algebra. This is the first,
the so to say Viète point of view concerning algebra. It was very clearly
expressed in the well-known book “Introduction to Algebra” by a member
of the Russian Academy of Sciences, the famous L. Euler, written in the
1760’s, i.e., 200 years ago.

Euler defined algebra as the theory of calculations with quantities of
various kinds. The first part of his book contains the theory of calculation
with integral rational numbers, ordinary fractions, square and cube roots,
the theory of logarithms, progressions, the theory of calculations with
polynomials, and the theory of Newton’s binomial series and its
applications. The second part consists of the theory of first-degree equations and
of systems of such equations, the theory of quadratic equations and of
solutions of third- and fourth-degree equations by radicals, and also an
extensive section on methods of solution of various indeterminate
equations in integers. For example, it was shown that Fermat’s equation
$x^3 + y^3 = z^3$ cannot be solved in nonzero integers $x$, $y$, $z$.

At the end of the 18th and the beginning of the 19th century, one of the
problems of algebra gradually began to occupy the central place, namely
the theory of solution of algebraic equations, in which the fundamental
difficulty is the solution of an nth-degree algebraic equation with one
unknown

$$x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n = 0.$$

This happened as a natural consequence of the importance of the problem
for the whole of pure and applied mathematics, and also because of the
difficulty and depth of the majority of the theorems connected with it.

The general formula for the solution of a quadratic equation
$x^2 + px + q = 0$,

$$x = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q},$$

was known to everybody. Italian algebraists of the 16th century found
analogous, though more complicated, general rules for the solution of
arbitrary third- and fourth-degree equations. Further investigations in
this direction for higher degree equations, however, met with
insurmountable difficulties. The greatest mathematicians of the 16th, 17th, 18th and
the beginning of the 19th century (Tartaglia, Cardan, Descartes, Newton,
d’Alembert, Tschirnhausen, Bézout, Lagrange, Gauss, Abel, Galois,
Lobačevskiĭ, Sturm, and others) created an impressive edifice of theorems
and methods connected with this problem. The two-volume algebra of
Serret (an epoch-making work of its time, since it presented for the first
time the high point of the theory of algebraic equations, namely the theory
of Galois) appeared in the middle of the 19th century, exactly 100 years
after Euler’s text; in it algebra was already defined as the theory of
algebraic equations. This is the second point of view concerning what algebra
is.

In the second half of the past century there occurred, on the basis of the
ideas of Galois about the theory of algebraic equations, a profound
development of group theory* and the theory of algebraic numbers (in
the creation of which a great part was played by the Russian mathematician
E. I. Zolotarev).

In this second period also, in connection with the same problems of
solution of an algebraic equation, and with the theory of algebraic varieties
of higher order (which were then being studied in analytic geometry) the
algebraic apparatus was developed in different directions, e.g., the theory
of determinants and matrices, the algebraic theory of quadratic forms and
linear transformations, and, in particular, the theory of invariants. During
almost the entire second half of the 19th century, the theory of invariants
was a central theme in algebra. In turn, the development of group theory
and the theory of invariants exerted in this period a great influence on
the development of geometry.†

A new, third point of view as to what algebra is came into existence 
chiefly in the following connection. In the second half of the last century, 
in mechanics, physics, and mathematics itself, scientists began more and 
more often to investigate objects for which it was natural to consider 
operations of addition and subtraction, and sometimes multiplication 
and division, but for which these operations were subjected to altogether 
different laws from those for rational numbers. 

We have already spoken of vectors. Other sorts of mathematical objects
with different laws of operation can only be mentioned here: e.g., matrices,
tensors, spinors, hypercomplex numbers. All these quantities are denoted
by letters, but their laws of operation differ from one another. If for some
set of objects (denoted by letters) certain operations are defined together
with the laws or rules that they must satisfy, then we say that an algebraic
system is defined. The third point of view on what algebra is consists of
regarding the whole of algebra as the study of various algebraic systems.
This is the so-called axiomatic or abstract algebra. It is abstract because
at a given step in the calculation we are not at all concerned with what the
letters in the algebraic system denote, the only important thing being the
axioms or laws satisfied by the operations; and it is called axiomatic
because it is constructed exclusively from the axioms stated at the
beginning. It is as though we have returned, but on a higher level, to the first or
Viète point of view on algebra, that algebra is the theory of symbolic
calculations. Although it makes no difference what the letters denote and
only the rules of operation are important, it is still true, of course,
that only those algebraic systems are interesting which have great
significance either in mathematics itself or in its applications.


* See Chapter XX.
† See Chapter XVII.

The great amount of algebraic material collected in the previous period 
served as the actual basis for the construction of contemporary abstract 
algebra. 

The early 1930’s saw the appearance of van der Waerden’s well-known
book “Modern Algebra,” which has played a great role in the propagation
of this third point of view as to what algebra is. The text of A. G. Kuroš
on algebra is oriented in the same direction.

In the present century algebra has found deep applications to geometry
(topology and the theory of Lie groups) and, as mentioned earlier, to
contemporary physics, especially to functional analysis and quantum
mechanics.

Particularly important at the present time are the problems of
mechanization of algebraic calculations by means of various mathematical
computing machines, especially high-speed electronic machines. The
questions connected with this type of computational mathematics raise
new distinctive problems in algebra.

In the present work, there are two chapters (not counting the present
one) that are devoted to algebra: linear algebra (Chapter XVI) and the
theory of groups and other algebraic systems (Chapter XX).


§2. Algebraic Solution of an Equation 

An algebraic equation of the nth degree with one unknown is an
equation of the form

$$x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n = 0,$$

where $a_1, a_2, \cdots, a_n$ are given coefficients.*


* We assume that all terms of the equation are transferred to the left-hand side
and that the equation is divided by the coefficient of the highest power of the unknown.

Equations of the first and second degree. If the equation is of the
first degree, then it has the form

$$x + a = 0$$

and is solved at once:

$$x = -a.$$


The second-degree equation

$$x^2 + px + q = 0$$

was solved in early antiquity. Its solution is very simple: if we transfer $q$
with the opposite sign to the right-hand side and then add $p^2/4$ to both
sides we have

$$x^2 + px + \frac{p^2}{4} = \frac{p^2}{4} - q.$$

But

$$x^2 + px + \frac{p^2}{4} = \left(x + \frac{p}{2}\right)^2,$$

hence

$$x + \frac{p}{2} = \pm\sqrt{\frac{p^2}{4} - q},$$

from which we obtain the well-known formula for the solutions of a
quadratic equation

$$x = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4} - q}.$$
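
The completing-the-square argument can be followed step by step in a short
Python sketch. The function name `solve_monic_quadratic` is our own choice and
does not come from the text; complex square roots are used so that the case
$p^2/4 - q < 0$ is covered as well.

```python
import cmath


def solve_monic_quadratic(p, q):
    """Return both roots of x^2 + p*x + q = 0 by completing the square."""
    # (x + p/2)^2 = p^2/4 - q, so x + p/2 = +/- sqrt(p^2/4 - q).
    half_width = cmath.sqrt(p * p / 4 - q)
    return -p / 2 + half_width, -p / 2 - half_width


# x^2 - 5x + 6 = 0 has the roots 3 and 2.
r1, r2 = solve_monic_quadratic(-5, 6)
print(r1, r2)

# x^2 + 2x + 5 = 0 has no real roots; the formula gives -1 + 2i and -1 - 2i.
print(solve_monic_quadratic(2, 5))
```

The results are returned as complex numbers even when the roots are real,
which keeps the sketch to a single code path.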



Third-degree equation. It was completely different with equations of
degree higher than 2. Already the general equation of the third degree
required quite profound considerations and resisted all the efforts of the
mathematicians of antiquity. It was only solved at the beginning of the
1500’s, in the era of the Renaissance in Italy, by the Italian mathematician
Scipio del Ferro. Del Ferro, following the custom of his time, did not
publish his own discoveries but communicated them to one of his pupils.
After the death of del Ferro this pupil challenged to competition one of
the great Italian mathematicians, Tartaglia, and proposed to him for
solution a series of third-degree equations. Tartaglia (1500-1557) accepted
the challenge and eight days before the end of the competition found a
method of solving any cubic equation of the form $x^3 + px + q = 0$.
In two hours he solved all the problems of his opponent. A professor of
physics and mathematics in Milan, Cardan (1501-1576), learning of
Tartaglia’s discoveries, began to entreat Tartaglia to inform him of his
secret. Tartaglia finally agreed, but with the condition that Cardan keep
his method a deep secret. Cardan violated his promise and published
Tartaglia’s result in his work “The Great Art” (“Ars Magna”).

The formula for the solution of a cubic equation has since then been
called Cardan’s formula, although it would be correct to call it Tartaglia’s
formula.

Cardan’s formula is derived as follows. 

In the first place, the solution of the general cubic equation

$$y^3 + ay^2 + by + c = 0 \qquad (1)$$

can easily be reduced to the solution of the cubic equation of the form

$$x^3 + px + q = 0, \qquad (2)$$

not containing a term with the square of the unknown. To do this it is
sufficient to set $y = x - a/3$. Indeed, substituting this expression into
equation (1) and removing the parentheses, we obtain

$$\left(x - \frac{a}{3}\right)^3 + a\left(x - \frac{a}{3}\right)^2 + b\left(x - \frac{a}{3}\right) + c = x^3 - 3x^2\,\frac{a}{3} + \cdots + ax^2 + \cdots,$$

where the dots indicate those terms in which $x$ is raised to the first power or
does not appear at all. We see that the terms containing $x^2$ cancel each
other out.
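
The terms hidden behind the dots can be written out explicitly: carrying the
expansion through gives $p = b - a^2/3$ and $q = 2a^3/27 - ab/3 + c$. The
following Python sketch (the names `depress_cubic` and `cubic` are ours, not
the text's) computes $p$ and $q$ in exact rational arithmetic and checks the
substitution $y = x - a/3$ on one example.

```python
from fractions import Fraction


def depress_cubic(a, b, c):
    """Return (p, q) such that y = x - a/3 turns
    y^3 + a*y^2 + b*y + c into x^3 + p*x + q."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    p = b - a**2 / 3
    q = 2 * a**3 / 27 - a * b / 3 + c
    return p, q


def cubic(y, a, b, c):
    """Value of y^3 + a*y^2 + b*y + c."""
    return y**3 + a * y**2 + b * y + c


# y^3 - 6y^2 + 11y - 6 has the roots 1, 2, 3; here a/3 = -2.
a, b, c = -6, 11, -6
p, q = depress_cubic(a, b, c)
print(p, q)  # the reduced equation is x^3 + p*x + q = 0

# Every root x of the reduced equation gives a root y = x - a/3 of (1).
for x in (-1, 0, 1):
    assert x**3 + p * x + q == 0
    assert cubic(x - Fraction(a, 3), a, b, c) == 0
```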

Let us now consider the following equation

$$x^3 + px + q = 0.$$

We set $x = u + v$, i.e., in place of one unknown we put two, $u$ and $v$,
and thereby turn the whole problem into a problem with two unknowns.
We have

$$(u + v)^3 + p(u + v) + q = 0,$$

or

$$u^3 + v^3 + q + (3uv + p)(u + v) = 0.$$

Whatever the sum $u + v$ of the two numbers may be, it is always possible to
require that their product $uv$ be equal to some quantity given beforehand.
If $u + v = A$, and we require $uv = B$, then since $v = A - u$, we obtain

$$u(A - u) = B,$$

so that it is sufficient that $u$ be a solution of the quadratic equation

$$u^2 - Au + B = 0,$$

and we know that every quadratic equation has real or complex roots,
given by the well-known formula. In our case, $u + v$ is equal to the desired
root $x$ of our cubic equation and we require that

$$uv = -\frac{p}{3},$$

i.e., that $3uv + p = 0$. With this choice of $u$ and $v$ we obtain

$$u^3 + v^3 + q = 0, \qquad 3uv + p = 0. \qquad (3)$$

Consequently, if we find numbers $u$ and $v$ satisfying this system of
equations, then the number $x = u + v$ will be a root of our equation.

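From system (3) it is easy to form a quadratic equation whose roots will
be $u^3$ and $v^3$. Indeed, it gives

$$u^3 + v^3 = -q, \qquad u^3 v^3 = -\frac{p^3}{27},$$

and, consequently, by a theorem already used earlier, $u^3$ and $v^3$ are the
roots of the quadratic equation

$$z^2 + qz - \frac{p^3}{27} = 0.$$

Solving it by the usual formula, we obtain

$$z = -\frac{q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}},$$

and, consequently,

$$x = u + v = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}};$$

this is the formula of Cardan.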
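
As a numerical illustration, the derivation can be traced in a few lines of
Python. This is only a sketch under our own assumptions: the name
`cardano_root` is ours, complex arithmetic is used throughout, and instead of
taking two independent cube roots we compute $v$ from the side condition
$3uv + p = 0$, so that the pair $u, v$ really satisfies system (3).

```python
import cmath


def cardano_root(p, q):
    """Return one root of x^3 + p*x + q = 0 following Cardan's formula."""
    # u^3 is a root of the auxiliary quadratic z^2 + q*z - p^3/27 = 0.
    disc = cmath.sqrt(q * q / 4 + p**3 / 27)
    u_cubed = -q / 2 + disc
    if abs(u_cubed) < 1e-14:  # take the other root to avoid u = 0 below
        u_cubed = -q / 2 - disc
    if abs(u_cubed) < 1e-14:  # both vanish only when p = q = 0
        return 0j
    u = u_cubed ** (1 / 3)  # principal complex cube root
    v = -p / (3 * u)        # enforces 3*u*v + p = 0
    return u + v


# x^3 - 6x - 9 = 0: here u^3 = 8, u = 2, v = 1, so x = 3.
print(cardano_root(-6, -9))

# x^3 - x = 0: the square root is imaginary, yet the root found is real.
x = cardano_root(-1, 0)
print(x, abs(x**3 - x))
```

The second example shows the case in which the expression under the square
root is negative although all three roots of the cubic are real.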


Fourth-degree equation. Soon after the solution of the cubic equation
the general fourth-degree equation was solved by Ferrari (1522-1565).
For the solution of the third-degree equation we have seen that the
preliminary solution of the auxiliary quadratic equation

$$z^2 + qz - \frac{p^3}{27} = 0$$

was necessary, where $z = u^3$ or $v^3$; analogously, the solution of a
fourth-degree equation can be based on the preliminary solution of an auxiliary
cubic equation.

Ferrari’s method consists of the following. Let the general
fourth-degree equation be given

$$x^4 + ax^3 + bx^2 + cx + d = 0.$$

Let us rewrite it as

$$x^4 + ax^3 = -bx^2 - cx - d$$

and add to both sides $a^2x^2/4$; then on the left we obtain a perfect square

$$\left(x^2 + \frac{ax}{2}\right)^2 = \left(\frac{a^2}{4} - b\right)x^2 - cx - d.$$

Adding now to both sides of the equation the terms

$$\left(x^2 + \frac{ax}{2}\right)y + \frac{y^2}{4},$$

where $y$ is a new variable, on which we later impose a necessary condition,
on the left we again obtain a perfect square

$$\left(x^2 + \frac{ax}{2} + \frac{y}{2}\right)^2 = \left(\frac{a^2}{4} - b + y\right)x^2 + \left(\frac{ay}{2} - c\right)x + \left(\frac{y^2}{4} - d\right). \qquad (4)$$

Thus we have reduced the problem to one with two unknowns.

On the right of equation (4) we have a quadratic trinomial in $x$, whose
coefficients depend on $y$. We select $y$ such that this trinomial will be the
square of a first-degree binomial $\alpha x + \beta$.

In order that the quadratic trinomial $Ax^2 + Bx + C$ be the square of
a binomial $\alpha x + \beta$ it is sufficient that

$$B^2 - 4AC = 0.$$

Indeed, if $B^2 - 4AC = 0$, then

$$Ax^2 + Bx + C = \left(\sqrt{A}\,x + \sqrt{C}\right)^2,$$

i.e.,

$$Ax^2 + Bx + C = (\alpha x + \beta)^2,$$

where

$$\alpha = \sqrt{A}, \qquad \beta = \sqrt{C},$$

the signs of the square roots being chosen so that $2\alpha\beta = B$.

Consequently, if we select $y$ such that

$$\left(\frac{ay}{2} - c\right)^2 - 4\left(\frac{a^2}{4} - b + y\right)\left(\frac{y^2}{4} - d\right) = 0,$$

then the right side of equation (4) will be the complete square $(\alpha x + \beta)^2$.
Removing the parentheses, we obtain a cubic equation in $y$

$$y^3 - by^2 + (ac - 4d)y - [d(a^2 - 4b) + c^2] = 0.$$

Solving this auxiliary cubic equation (for example, by Cardan’s formula)
we find $\alpha$ and $\beta$ in terms of its solution $y_0$, and equation (4) becomes

$$\left(x^2 + \frac{ax}{2} + \frac{y_0}{2}\right)^2 = (\alpha x + \beta)^2,$$

from which

$$x^2 + \frac{ax}{2} + \frac{y_0}{2} = \alpha x + \beta \quad \text{or} \quad x^2 + \frac{ax}{2} + \frac{y_0}{2} = -\alpha x - \beta.$$

From these two quadratic equations we can find all four roots of the given
fourth-degree equation.
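
The whole chain (auxiliary cubic, then two quadratics) can be sketched in
Python as follows. The function names are ours, floating-point complex
arithmetic is assumed, and the sign of $\beta$ is fixed by comparing
$2\alpha\beta$ with $B$; the sketch makes no claim to numerical robustness and
is meant only to mirror the steps of the method.

```python
import cmath


def solve_quadratic(p, q):
    """Both roots of x^2 + p*x + q = 0."""
    root = cmath.sqrt(p * p / 4 - q)
    return [-p / 2 + root, -p / 2 - root]


def one_cubic_root(a, b, c):
    """One root of y^3 + a*y^2 + b*y + c = 0 by Cardan's formula."""
    p = b - a * a / 3
    q = 2 * a**3 / 27 - a * b / 3 + c
    disc = cmath.sqrt(q * q / 4 + p**3 / 27)
    u_cubed = -q / 2 + disc
    if abs(u_cubed) < 1e-14:
        u_cubed = -q / 2 - disc
    if abs(u_cubed) < 1e-14:
        x = 0j
    else:
        u = u_cubed ** (1 / 3)
        x = u - p / (3 * u)
    return x - a / 3


def ferrari(a, b, c, d):
    """All four roots of x^4 + a*x^3 + b*x^2 + c*x + d = 0."""
    # Auxiliary cubic: y^3 - b*y^2 + (a*c - 4*d)*y - [d*(a^2 - 4*b) + c^2] = 0.
    y0 = one_cubic_root(-b, a * c - 4 * d, -(d * (a * a - 4 * b) + c * c))
    # Right side of (4) is A*x^2 + B*x + C = (alpha*x + beta)^2.
    A = a * a / 4 - b + y0
    B = a * y0 / 2 - c
    C = y0 * y0 / 4 - d
    alpha, beta = cmath.sqrt(A), cmath.sqrt(C)
    if abs(2 * alpha * beta - B) > abs(2 * alpha * beta + B):
        beta = -beta  # choose the sign for which 2*alpha*beta = B
    # x^2 + a*x/2 + y0/2 = +(alpha*x + beta)  or  = -(alpha*x + beta)
    roots = solve_quadratic(a / 2 - alpha, y0 / 2 - beta)
    roots += solve_quadratic(a / 2 + alpha, y0 / 2 + beta)
    return roots


# x^4 - 10x^2 + 9 = (x^2 - 1)(x^2 - 9) has the roots -3, -1, 1, 3.
for r in ferrari(0, -10, 0, 9):
    print(r)
```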

This is how third- and fourth-degree algebraic equations were solved
by Italian mathematicians in the 1500’s.

The success of the Italian mathematicians produced a very great effect.
It was the first instance when modern science had exceeded the
achievements of the ancients. Until then, in the whole course of the Middle
Ages, the aim had always been only to understand the work of the ancients,
and now, finally, certain questions were solved which the ancients had not
succeeded in conquering. And this happened in the 1500’s, i.e., in the
century before the invention of new branches of mathematics: analytic
geometry, differential calculus, and integral calculus, which finally affirmed
the superiority of the new science over the old. After this there was no
important mathematician who did not attempt to extend the achievements
of the Italians and to solve equations of fifth, sixth, and higher degree in
an analogous way by means of radicals.

The prominent algebraist of the 17th century, Tschirnhausen (1651-1708),
even believed that he had finally found a general method of solution.
His method was based on the transformation of an equation to a simpler
one, but this very transformation required the solution of some auxiliary
equations. Subsequently, by a deeper analysis it was shown that
Tschirnhausen’s method of transformation indeed gives the solution of second-,
third-, and fourth-degree equations, but already for a fifth-degree equation
it requires the preliminary solution of an auxiliary equation of the sixth
degree, whose solution in turn was not known.

Factorization of a polynomial and Viète’s formulas. If we accept
without proof the so-called fundamental theorem of algebra* that every
equation

$$f(x) = 0,$$

where $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$
is a polynomial in $x$ of given degree $n$ and the coefficients $a_1, a_2, \cdots, a_n$
are given real or complex numbers, has at least one real or complex root,
and take into consideration that all computations with complex numbers
are carried out by the same rules as with rational numbers, then it is easy
to show that the polynomial $f(x)$ can be represented (and in only one way)
as a product of first-degree factors

$$f(x) = (x - a)(x - b) \cdots (x - l),$$

where $a, b, \cdots, l$ are real or complex numbers.

Indeed, let $a$ be a root of $f(x)$; we divide $f(x)$ by $x - a$; since the divisor
is of the first degree, the remainder will be a constant number $R$, i.e., we
will have the identity

$$f(x) = (x - a) f_1(x) + R,$$

where $f_1(x)$ is a polynomial of degree $n - 1$ and $R$ is a constant.
Substituting here in place of $x$ the number $a$, we obtain

$$f(a) = (a - a) f_1(a) + R = R.$$

But since $a$ is a root of $f(x)$, we have $f(a) = 0$, and hence $R = 0$, i.e.,
a polynomial can always be divided by $(x - a)$ without remainder, where
$a$ is a root of this polynomial. Thus

$$f(x) = (x - a) f_1(x).$$

But if the fundamental theorem of algebra is true, then in turn the
polynomial $f_1(x)$ has a root $b$, and we obtain analogously

$$f_1(x) = (x - b) f_2(x),$$

where the polynomial $f_2(x)$ is already of degree $(n - 2)$, etc. This
factorization, as can easily be shown, is unique.
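
The division of $f(x)$ by $x - a$ used in this argument can be carried out
mechanically. The sketch below uses Horner's scheme, a standard procedure
that the text does not name; the function name `divide_by_linear` is ours.
It returns the coefficients of $f_1(x)$ together with the remainder $R$, and
one sees directly that $R = f(a)$.

```python
def divide_by_linear(coeffs, a):
    """Divide f(x) by (x - a) using Horner's scheme.

    coeffs lists the coefficients of f from the highest power down.
    Returns (quotient_coeffs, remainder); the remainder equals f(a).
    """
    values = [coeffs[0]]
    for c in coeffs[1:]:
        values.append(values[-1] * a + c)
    return values[:-1], values[-1]


# f(x) = x^3 - 6x^2 + 11x - 6 has the root a = 1, so R = 0
# and f(x) = (x - 1)(x^2 - 5x + 6).
quotient, remainder = divide_by_linear([1, -6, 11, -6], 1)
print(quotient, remainder)

# a = 4 is not a root: the remainder is f(4) = 6.
print(divide_by_linear([1, -6, 11, -6], 4))
```

Applying the function again to the quotient with the next root repeats the
step $f_1(x) = (x - b) f_2(x)$ of the argument.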

* The proof of the fundamental theorem of algebra is difficult and was given
considerably later. We devote §3 to it. But its validity was assumed long before it was
rigorously proved.

Every nth-degree polynomial $f(x)$ has in this sense $n$ and only $n$ roots
$a, b, \cdots, l$. These roots may be all distinct but it can happen that some
among them are identical. Then we say that the corresponding root of
the polynomial $f(x)$ is a multiple root with such and such a multiplicity.
Multiplying out the expression

$$(x - a)(x - b)(x - c) \cdots (x - l)$$

and comparing the coefficients of the same powers of $x$, we see immediately
that

$$-a_1 = a + b + c + \cdots + l,$$
$$a_2 = ab + ac + \cdots + kl,$$
$$-a_3 = abc + abd + \cdots,$$
$$\cdots$$
$$\pm a_n = abc \cdots l,$$

which are Viète’s formulas.
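
Viète's formulas are easy to check on a concrete example. In the following
Python sketch (the helper names are ours) the product
$(x - 1)(x - 2)(x - 3)(x - 4)$ is multiplied out and each coefficient $a_k$ is
compared with $(-1)^k$ times the sum of all products of $k$ roots.

```python
from itertools import combinations
from math import prod


def poly_from_roots(roots):
    """Coefficients (highest power first) of (x - r1)(x - r2)...(x - rn)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [0]
        scaled = [0] + [-r * c for c in coeffs]
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def elementary_symmetric(roots, k):
    """Sum of all products of k distinct roots."""
    return sum(prod(combo) for combo in combinations(roots, k))


roots = [1, 2, 3, 4]
coeffs = poly_from_roots(roots)  # [1, a_1, a_2, a_3, a_4]
print(coeffs)
for k in range(1, len(roots) + 1):
    assert coeffs[k] == (-1) ** k * elementary_symmetric(roots, k)
```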

A theorem on symmetric polynomials. Viète’s formulas are polynomials
in the $n$ letters $a, b, \cdots, l$ which do not vary under any permutation
of these letters. Indeed, $a + b + \cdots + k + l = b + a + \cdots + k + l$,
etc. In general, any such polynomials in $n$ letters, which do not change
under any permutations of these letters, are called symmetric polynomials
in $n$ letters. For example, $5x^2 + 5y^2 - 7xy$ is a symmetric polynomial in $x$
and $y$. It is possible to prove the theorem that every integral symmetric
polynomial in $n$ letters with arbitrary coefficients $A, B, \cdots$ can be expressed
integral rationally, i.e., with the operations of addition, subtraction, and
multiplication, in terms of the coefficients $A, B, \cdots$ and of Viète’s
polynomials in the letters. If $a, b, \cdots, l$ are the roots of an nth-degree equation
$x^n + a_1 x^{n-1} + \cdots + a_n = 0$, then every symmetric polynomial in
$a, b, \cdots, l$ with arbitrary coefficients $A, B, \cdots$ can thus be expressed integral
rationally in terms of these coefficients $A, B, \cdots$ and the coefficients
$a_1, a_2, \cdots, a_n$ of the equation. This is the so-called fundamental theorem
of symmetric polynomials.
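
A small instance of the theorem: the sum of the squares and the sum of the
cubes of the roots of a cubic are symmetric polynomials in the roots, so they
can be computed from the coefficients alone, without solving the equation.
The identities used below, $a^2 + b^2 + c^2 = e_1^2 - 2e_2$ and
$a^3 + b^3 + c^3 = e_1^3 - 3e_1e_2 + 3e_3$, are the classical ones; the names
$e_1, e_2, e_3$ for Viète's polynomials and the function name are our own.

```python
def power_sums_from_coefficients(a1, a2, a3):
    """Sum of squares and sum of cubes of the roots of
    x^3 + a1*x^2 + a2*x + a3 = 0, computed without finding the roots."""
    # Viete's formulas: a + b + c = -a1, ab + ac + bc = a2, abc = -a3.
    e1, e2, e3 = -a1, a2, -a3
    sum_squares = e1**2 - 2 * e2
    sum_cubes = e1**3 - 3 * e1 * e2 + 3 * e3
    return sum_squares, sum_cubes


# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
squares, cubes = power_sums_from_coefficients(-6, 11, -6)
assert squares == 1**2 + 2**2 + 3**2
assert cubes == 1**3 + 2**3 + 3**3
print(squares, cubes)
```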

Lagrange’s contributions. The famous French mathematician Lagrange
in his great work “Reflections on the solution of algebraic equations”
published in 1770-1771 (with more than 200 pages), critically examined
all the solutions of second-, third- and fourth-degree equations that were
known up to his time and showed that their success was always based on
properties which did not hold for equations of degree 5 or higher. From
del Ferro’s time until this work of Lagrange more than two and a half
centuries had passed by and nobody during this long interval had doubted
the possibility of solving equations of degree 5 and higher by radicals,
i.e., of finding formulas involving only the operations of addition,
subtraction, multiplication, division, and radicals with integral positive
exponents, which would express the solution of an equation in terms of
its coefficients, that is, formulas similar to those by which the quadratic
equation had been solved in antiquity and the third- and fourth-degree
equations in the 1500’s by the Italians. They regarded this situation as
being due only to their own inability to find a valid but apparently deeply
hidden solution.

Lagrange says in his memoir: “The problem of solving (by radicals)
equations whose degree is higher than four is one of those problems which
have not been solved although nothing proves the impossibility of solving
them” and two pages later he supplements this: “From our reasoning we
see that it is very doubtful that the methods which we have considered
could give a complete solution of equations of the fifth degree.”

In his investigations, Lagrange introduced the expression

$$a + \varepsilon b + \varepsilon^2 c + \cdots + \varepsilon^{n-1} l$$

in the roots $a, b, \cdots, l$ of an equation, where $\varepsilon$ is an nth root of unity,*
having established that such expressions are closely connected with the
solution of equations by radicals. These expressions are now called
“Lagrange resolvents.”

In addition, Lagrange observed that the theory of permutations of roots
of an equation is of great importance in the theory of solution of equations.
He even expressed the thought that the theory of permutations is the
“true philosophy of the whole question,” in which he was completely
right, as was shown in the later investigations of Galois.

Lagrange’s methods of solution of second-, third- and fourth-degree
equations were not the same as those of the Italians, which in every case
were based on special transformations of a complicated and so to speak
accidental kind. Lagrange’s methods were altogether orderly and
developed from one general idea involving the theory of symmetric polynomials,
the theory of permutations, and the theory of resolvents.


* I.e., a complex number which raised to the nth power is equal to one. For example,
the cube roots of unity can have the values

$$1, \qquad -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i, \qquad -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i,$$

where $i = \sqrt{-1}$ (see §3). Indeed,

$$\left(-\frac{1}{2} + \frac{\sqrt{3}}{2}\,i\right)^3 = -\frac{1}{8} + \frac{3\sqrt{3}}{8}\,i + \frac{9}{8} - \frac{3\sqrt{3}}{8}\,i = 1,$$

and analogously

$$\left(-\frac{1}{2} - \frac{\sqrt{3}}{2}\,i\right)^3 = 1.$$

Let us consider, for example, the solution by Lagrange’s method of
the general fourth-degree equation

$$x^4 + mx^3 + nx^2 + px + q = 0.$$

Let the roots of this equation be $a, b, c, d$. Consider the resolvent

$$a + b - c - d,$$

i.e.,

$$a + \varepsilon c + \varepsilon^2 b + \varepsilon^3 d,$$

where $\varepsilon = -1$. If we permute $a, b, c, d$ in all $1 \cdot 2 \cdot 3 \cdot 4 = 24$ different ways,
we obtain altogether six different expressions

$$a + b - c - d,$$
$$a + c - b - d,$$
$$a + d - c - b, \qquad (5)$$
$$c + d - a - b,$$
$$b + d - a - c,$$
$$b + c - a - d.$$

An equation of the sixth degree, whose roots are these six expressions,
will thus have coefficients that do not vary with all 24 permutations of
$a, b, c, d$, since any of the 24 permutations can only permute these
expressions among themselves and the coefficients of the sixth-degree equation
do not depend on the order in which we take its roots. Thus, these
coefficients are symmetric polynomials in $a, b, c, d$. But then, by virtue of
the fundamental theorem on symmetric polynomials, these coefficients are
expressed integral rationally in terms of the coefficients $m, n, p, q$ of the
equation. In addition, since expressions (5) are pairwise of opposite signs,
this sixth-degree equation will contain only terms of even powers. Indeed,
if expressions (5) are denoted by $\alpha, \beta, \gamma, -\alpha, -\beta, -\gamma$ respectively, then
the left-hand side of the sixth-degree equation will be equal to

$$(y - \alpha)(y + \alpha)(y - \beta)(y + \beta)(y - \gamma)(y + \gamma) = (y^2 - \alpha^2)(y^2 - \beta^2)(y^2 - \gamma^2).$$

Direct computation gives the sixth-degree equation

$$y^6 - (3m^2 - 8n)y^4 + (3m^4 - 16m^2n + 16n^2 + 16mp - 64q)y^2 - (m^3 - 4mn + 8p)^2 = 0.$$

Letting $y^2 = t$, we obtain a cubic equation in $t$, and if $t', t'', t'''$ are its
roots, then

$$a + b - c - d = \sqrt{t'},$$
$$a + c - b - d = \sqrt{t''},$$
$$a + d - b - c = \sqrt{t'''}.$$

We also have

$$a + b + c + d = -m.$$

Adding these equations after multiplication by $1, 1, 1, 1$ or $1, -1, -1, 1$,
or $-1, 1, -1, 1$, or $-1, -1, 1, 1$, we obtain

$$a = \tfrac{1}{4}\left(-m + \sqrt{t'} + \sqrt{t''} + \sqrt{t'''}\right),$$
$$b = \tfrac{1}{4}\left(-m + \sqrt{t'} - \sqrt{t''} - \sqrt{t'''}\right),$$
$$c = \tfrac{1}{4}\left(-m - \sqrt{t'} + \sqrt{t''} - \sqrt{t'''}\right),$$
$$d = \tfrac{1}{4}\left(-m - \sqrt{t'} - \sqrt{t''} + \sqrt{t'''}\right).$$

Thus, the solution of a fourth-degree equation is reduced to the solution
of a cubic equation; and third- and second-degree equations are solved
analogously.
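
The statements about the resolvent can be verified on a numerical example.
The Python sketch below (function names and the sample roots 1, 2, 3, 5 are
our own choices) checks three things in exact integer arithmetic: that the 24
permutations yield only six values, that the squares of the resolvents satisfy
the cubic in $t$, and that the four sums with the multipliers listed above
return the roots.

```python
from itertools import combinations, permutations
from math import prod


def quartic_coefficients(roots):
    """m, n, p, q of x^4 + m*x^3 + n*x^2 + p*x + q with the given roots."""
    e = [sum(prod(c) for c in combinations(roots, k)) for k in range(1, 5)]
    return -e[0], e[1], -e[2], e[3]


def resolvent_cubic(m, n, p, q):
    """Coefficients (s1, s2, s3) of t^3 - s1*t^2 + s2*t - s3 = 0,
    obtained from the sixth-degree equation by putting y^2 = t."""
    s1 = 3 * m**2 - 8 * n
    s2 = 3 * m**4 - 16 * m**2 * n + 16 * n**2 + 16 * m * p - 64 * q
    s3 = (m**3 - 4 * m * n + 8 * p) ** 2
    return s1, s2, s3


a, b, c, d = roots = (1, 2, 3, 5)
m, n, p, q = quartic_coefficients(roots)

# The 24 permutations give only six distinct values of the resolvent.
values = {w + x - y - z for w, x, y, z in permutations(roots)}
assert len(values) == 6

# The squares of the three resolvents are roots of the cubic in t.
alpha, beta, gamma = a + b - c - d, a + c - b - d, a + d - b - c
s1, s2, s3 = resolvent_cubic(m, n, p, q)
for t in (alpha**2, beta**2, gamma**2):
    assert t**3 - s1 * t**2 + s2 * t - s3 == 0

# Combining the resolvents with a + b + c + d = -m recovers the roots.
recovered = (
    (-m + alpha + beta + gamma) / 4,
    (-m + alpha - beta - gamma) / 4,
    (-m - alpha + beta - gamma) / 4,
    (-m - alpha - beta + gamma) / 4,
)
print(recovered)
```

In this check the resolvents are computed from known roots, so the question
of which sign to give each square root does not arise; when one starts from
the coefficients alone, the signs have to be chosen consistently.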

Lagrange achieved a great deal in the theory of algebraic equations.
However, even after his persistent efforts the problem of solution in
radicals of algebraic equations with degree higher than 4 remained to be
settled. This problem, on which mathematicians had worked in vain for
almost three centuries, constituted, in the expression of Lagrange, “a
challenge to the human mind.”

Abel’s discovery. Consequently it was a great surprise to all
mathematicians when in 1824 the work of a young Norwegian genius Abel
(1802-1829) came to light, in which a proof was given that if the coefficients
of an equation $a_1, a_2, \cdots, a_n$ are regarded simply as letters, then there
does not exist any radical expression in these coefficients that is a root of
the corresponding equation, if its degree $n \ge 5$. Thus, for three centuries
the efforts of the greatest mathematicians of all countries to solve equations
of degree 5 or higher in radicals did not lead to success for the simple
reason that this problem simply does not have a solution.

Such a formula is known for second-degree equations, and as we saw
analogous formulas exist for third- and fourth-degree equations, but for
equations of degree 5 or greater there are no such formulas.

Abel’s proof is difficult and we will not give it here.

Galois theory. But this was not yet all. A very remarkable result in
the theory of algebraic equations still remained to come. The point is
that there are arbitrarily many special forms of equations of any degree
that are solvable in radicals, and many of them are exactly those equations
that are important in the applications. Such, for instance, are the binomial
equations $x^n = A$. Abel found another very broad class of such equations,
the so-called cyclic equations and still more general “Abelian” equations.
In connection with the problem of construction by ruler and compass of
regular polygons, Gauss explicitly considered the so-called cyclotomic
equations, i.e., equations of the form

$$x^{p-1} + x^{p-2} + \cdots + x + 1 = 0,$$

where $p$ is a prime number, and showed that they can always be reduced
to a chain of equations of lower degree; moreover, he found necessary
and sufficient conditions that such an equation can be solved in square
roots. The necessity of these conditions was rigorously proved only by
Galois.

Thus, after Abel’s work the situation was the following: Although, as
was shown by Abel, the general equation of degree higher than 4 cannot
be solved by radicals, there are arbitrarily many different special equations
of arbitrary degree, all of which can be solved by radicals. The whole
question of solving equations in radicals was placed by these discoveries
on completely new ground. It became clear that the task now was to
determine exactly which equations can be solved by radicals, or in other words,
what are the necessary and sufficient conditions for the solvability of an
equation in radicals. This problem, the answer to which gave in some
sense the final elucidation of the whole problem, was solved by the
ingenious French mathematician Evariste Galois.

Galois (1811-1832) perished at the age of 20 in a duel. In the last two
years of his life he could not devote much time to mathematics, since he
was carried away by the stormy whirl of political life at the time of the
1830 Revolution and languished in jail for his speech against the
reactionary regime of Louis Philippe. Nevertheless, in his short life, Galois
made discoveries far ahead of his time in various parts of mathematics
and in particular produced some very remarkable results in the theory of
algebraic equations. In a small publication “Memoir on the conditions
of solvability of equations in radicals” which remained in manuscript
form after his death and was first published in 1846 by Liouville, Galois
started from some very simple but profound concepts and finally untangled
the whole complex of difficulties surrounding the solution of equations
in radicals, difficulties with which the most outstanding mathematicians
had struggled unsuccessfully up to his time. The success of Galois was
based on the fact that for the first time he introduced into the theory
of equations a series of exceedingly important new general concepts,
which subsequently played a great role in mathematics as a whole.

Let us consider the Galois theory for a special case, namely when the
coefficients $a_1, a_2, \cdots, a_n$ of the given nth-degree equation

$$x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0 \qquad (6)$$

are rational numbers. This case is particularly interesting and already
involves essentially all the difficulties of the general Galois theory. We will
also assume that the roots $a, b, c, \cdots$ of this equation are distinct.

Galois begins, like Lagrange, by considering a first-degree expression
in $a, b, c, \cdots$

$$V = Aa + Bb + Cc + \cdots,$$

although he does not require that the coefficients $A, B, C, \cdots$ of this
expression should be roots of unity, but takes for $A, B, C, \cdots$ any
integral rational numbers such as to give numerically distinct values for
all the $n! = 1 \cdot 2 \cdot 3 \cdots n$ quantities $V, V', V'', \cdots, V^{(n!-1)}$ obtained from $V$ by
permuting the roots $a, b, c, \cdots$ in all $n!$ possible ways. This can always be
done. Then Galois constructs the equation of degree $n!$ whose roots are
$V, V', V'', \cdots, V^{(n!-1)}$. The theorem on symmetric polynomials shows
that the coefficients of this equation $\Phi(x) = 0$ of degree $n!$ will be rational
numbers.

Up to now everything is quite similar to what Lagrange did. 

Next Galois introduced the first important new concept, the concept of
irreducibility of a polynomial in a given field of numbers. If a polynomial
in $x$ is given, whose coefficients, for example, are rational numbers, then
the polynomial is called reducible in the field of rational numbers if it can
be represented in the form of a product of polynomials of lower degrees
with rational coefficients. If not, then the polynomial is called irreducible
in the field of rational numbers. The polynomial $x^3 - x^2 - 4x - 6$ is
reducible in the rational number field, since it is equal to
$(x^2 + 2x + 2)(x - 3)$, but for instance, the polynomial $x^3 + 3x^2 + 3x - 5$
is irreducible, as can be shown, in the field of rational numbers.
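
For the two cubics just mentioned the claim can be checked by a simple search.
A cubic that factors over the rational numbers must have a first-degree
factor, hence a rational root, and a rational root of a polynomial with
integer coefficients has a numerator dividing the constant term and a
denominator dividing the leading coefficient. The Python sketch below uses
this rational-root test, which is a standard device and not a method described
in the text; the function names are ours, and the constant term is assumed to
be nonzero.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a polynomial with integer coefficients
    (highest power first, nonzero constant term)."""
    lead, const = coeffs[0], coeffs[-1]
    found = set()
    for num in divisors(const):
        for den in divisors(lead):
            for cand in (Fraction(num, den), Fraction(-num, den)):
                value = 0
                for c in coeffs:  # Horner evaluation at cand
                    value = value * cand + c
                if value == 0:
                    found.add(cand)
    return sorted(found)


# x^3 - x^2 - 4x - 6 has the rational root 3, so it is reducible.
print(rational_roots([1, -1, -4, -6]))

# x^3 + 3x^2 + 3x - 5 has no rational root; being a cubic, it is irreducible.
print(rational_roots([1, 3, 3, -5]))
```

For polynomials of degree four or higher the absence of rational roots no
longer suffices, since a product of two irreducible quadratics has none.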

There exist methods, admittedly requiring long computations, of 
factoring any given polynomial with rational coefficients into irreducible 
polynomials in the field of rational numbers. 

Galois then factors the above polynomial $\Phi(x)$ into irreducible factors
in the field of rational numbers.

Let $F(x)$ be one of these irreducible polynomials (which one of them is
immaterial for what follows) and let it be of degree $m$.

The polynomial $F(x)$ will then be the product of $m$ of the $n!$ first-degree
factors $x - V, x - V', \cdots, x - V^{(n!-1)}$, into which the $n!$th-degree
polynomial $\Phi(x)$ was decomposed. Let these $m$ factors be $x - V$,
$x - V', \cdots, x - V^{(m-1)}$. We enumerate in any order the roots $a, b, c, \cdots, l$
of the given nth-degree equation (6) by giving them the indices $1, 2, \cdots, n$.
Then the quantities $V, V', \cdots, V^{(n!-1)}$ correspond to all possible $n!$
permutations of the numbers $1, 2, \cdots, n$, corresponding to permutations of the
roots, and the $V, V', \cdots, V^{(m-1)}$ correspond to only $m$ of these permutations.
The set $G$ of these $m$ permutations of the numbers $1, 2, \cdots, n$ is called the
Galois group of the given equation (6).*

Then Galois introduces some new concepts and develops simple but
truly remarkable arguments by which he proves that a necessary and
sufficient condition for the solvability of equation (6) in radicals is that
the group $G$ of permutations of the numbers $1, 2, \ldots, n$ satisfies a certain
definite condition.

Thus, Lagrange’s prophecy that at the basis of the whole problem lay 
the theory of permutations proved to be true. 

In particular, Abel’s theorem on the nonsolvability of a general fifth-degree
equation in radicals can now be proved as follows. It can be shown
that there exist arbitrarily many fifth-degree equations, even with integral
rational coefficients, for which the corresponding 120th-degree polynomial
$\Phi(x)$ is irreducible, i.e., whose Galois group is the group of all $5! = 120$
permutations of the indices 1, 2, 3, 4, 5 of its roots. But this group, as can
be shown, does not satisfy the Galois criterion, and therefore these
fifth-degree equations cannot be solved in radicals.

For instance, it can be shown that the equation $x^5 + x - a = 0$,
where $a$ is a positive whole number, in most cases cannot be solved by
radicals. For example, it is not solvable in radicals for $a = 3, 4, 5, 7,
8, 9, 10, 11, \ldots$.
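
The values $a = 2$ and $a = 6$ are absent from this list. The following
sketch (helper names are illustrative, not from the text) suggests why:
for these two values the polynomial $x^5 + x - a$ splits into rational
factors of lower degree. It proves nothing about the values that are
listed; it only checks two factorizations by direct computation.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    """Evaluate a polynomial at x by Horner's rule."""
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc

# a = 2: x = 1 is a root of x^5 + x - 2, so (x - 1) splits off.
value_at_one = poly_eval([1, 0, 0, 0, 1, -2], 1)

# a = 6: (x^2 - x + 2)(x^3 + x^2 - x - 3) multiplies out to x^5 + x - 6.
factored_six = poly_mul([1, -1, 2], [1, 1, -1, -3])
```
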

The application of Galois theory to the problem of solvability of geometric
problems by ruler and compass. One of the most remarkable special
applications of Galois theory is the following. Many problems of plane
geometry can be solved by constructions with ruler and compass alone.
For example, we can construct with ruler and compass a regular
triangle, square, pentagon, hexagon, octagon, decagon, etc., but it is
impossible to construct a regular polygon of seven, nine, or eleven sides.
Which problems can be solved by ruler and compass, and which cannot?
Before Galois this was an unsolved problem. From the Galois theory we
obtain the following answer.

The simultaneous solution of the equations of two lines, of a line and a
circle, or of two circles can be reduced to the solution of equations of first
or second degree. For a line and a circle this is clear, and in the case of two
circles $(x - a_1)^2 + (y - b_1)^2 = r_1^2$ and $(x - a_2)^2 + (y - b_2)^2 = r_2^2$,
if we subtract one equation from the other, the $x^2$ and $y^2$ cancel out, and we
obtain a first-degree equation, which is to be solved simultaneously with
the equation of one of the circles, so that again we have a quadratic
equation. Therefore, every step of the problem to be solved by ruler and
compass is reduced to an equation of first or second degree, and
consequently, all problems solvable with ruler and compass are reduced to
an algebraic equation with one unknown, whose solution involves the
extraction of a chain of square roots. Conversely, if the solution of a
geometric problem is reduced to such an algebraic equation, then it can
be solved by ruler and compass, since square roots, as is well known,
can be constructed by ruler and compass.

* More will be said about Galois groups in §5, Chapter XX.
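
The reduction for two circles can be followed numerically. The function
below is a minimal sketch under stated assumptions: the name
`intersect_circles` and the choice to solve the line-and-circle system
through the foot of a perpendicular are illustrative, not taken from the
text. It subtracts the two circle equations to obtain a line and then
intersects that line with the first circle; square roots are the only
irrational operations used.

```python
import math

def intersect_circles(a1, b1, r1, a2, b2, r2):
    """Intersection points of two circles with centers (a1, b1), (a2, b2)."""
    # Subtracting the circle equations cancels x^2 and y^2 and leaves
    # the first-degree equation  L*x + M*y = N.
    L = 2.0 * (a2 - a1)
    M = 2.0 * (b2 - b1)
    N = (r1**2 - r2**2) - (a1**2 - a2**2) - (b1**2 - b2**2)
    norm2 = L * L + M * M
    if norm2 == 0.0:
        raise ValueError("concentric circles: subtraction gives no line")
    # Foot of the perpendicular dropped from the first center onto the line.
    t = (N - L * a1 - M * b1) / norm2
    fx, fy = a1 + L * t, b1 + M * t
    # Squared half-length of the common chord (the quadratic step).
    half2 = r1**2 - t * t * norm2
    if half2 < 0.0:
        return []  # the circles do not meet
    half = math.sqrt(half2)
    norm = math.sqrt(norm2)
    ux, uy = -M / norm, L / norm  # unit vector along the line
    return sorted({(fx + half * ux, fy + half * uy),
                   (fx - half * ux, fy - half * uy)})

# Two circles of radius 5 centered at (0, 0) and (8, 0).
points = intersect_circles(0, 0, 5, 8, 0, 5)
```
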

If a geometric problem is given, we must first set up an algebraic
equation equivalent to the given problem. If it is impossible to set up such
an equation, the problem is obviously not solvable by ruler and compass.
If the equation has been set up, then we must select that one of its
irreducible factors that is connected with the solution of the problem, and
determine whether this irreducible equation can be solved in square roots.
As the Galois theory shows, for this it is necessary and sufficient that the
number $m$ of permutations that constitute its Galois group be a power
of 2.

With this test we can prove the theorem stated by Gauss that a regular
polygon with a prime number $p$ of sides can be constructed by ruler and
compass if and only if the prime number $p$ has the form $2^{2^n} + 1$, i.e.,
for $p = 3, 5, 17, 257$ but not for $p = 7, 11, 13, 19, 23, 29, 31, \ldots$, etc.
Gauss proved only the “if” part of this assertion.
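
The list of primes can be reproduced with a few lines of code. The sketch
below rests on a fact the text does not state explicitly: for a prime $p$,
the irreducible equation attached to the regular $p$-gon has a Galois
group of $p - 1$ permutations, so the power-of-2 test becomes the
condition that $p - 1$ be a power of 2. The function names are
illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and n & (n - 1) == 0

# Odd primes below 300 for which p - 1 is a power of 2.
constructible = [p for p in range(3, 300)
                 if is_prime(p) and is_power_of_two(p - 1)]

# For each such prime, p - 1 = 2^e; the exponents e are themselves
# powers of 2, which matches the form 2^(2^n) + 1 given in the text.
exponents = [(p - 1).bit_length() - 1 for p in constructible]
```
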

By the same method we can prove that it is impossible to divide an 
arbitrary angle into three equal parts by ruler and compass, or to duplicate 
the cube, i.e., from the edge of a given cube to find the edge of a cube with 
twice as great a volume, and so forth. 

The impossibility of squaring the circle, i.e., of constructing with ruler
and compass the side of a square equal in area to a circle with given radius,
is proved in a different way. Namely, it can be shown that the side of such
a square is not connected with the radius by any algebraic equation, i.e.,
it is, so to speak, transcendental relative to the radius, and consequently
it is a fortiori not expressible in terms of the radius by a chain of square
roots. This proof is difficult and it does not follow from Galois theory.

Two fundamental unsolved problems connected with Galois theory. 

In Galois theory there remain two further basic problems which have not
yet been solved in their general form, although many excellent
mathematicians have been working on them almost uninterruptedly.

The first of these is the problem of the so-called Hilbert-Čebotarev
resolvents (not to be confused with the Lagrange resolvents), which is a
direct generalization of the problem of solution of equations in radicals.
The idea is this: Saying that an equation is solvable in radicals is exactly
the same as saying that its solution is reduced to a chain of successive
binomial equations, since the radical $\sqrt[n]{A}$ is a root of the binomial
equation $x^n = A$. But it may happen that although the equation cannot be
reduced to a chain of such simple equations as the binomial ones, it can
nevertheless be reduced to a chain of certain other very simple equations.
Back at the end of the 18th century, it had been shown that the general
fifth-degree equation can be reduced to a chain of binomial equations
together with one further equation of the form $x^5 + x + A = 0$, which,
although not binomial, has, like the binomial equations, only one
parameter $A$.

Later on it was proved that a sixth-degree equation already cannot
be reduced to a chain of one-parameter equations. For equations of any
degree we are required to solve the problem: what kind of simpler equations,
i.e., with a minimum number of parameters, make up the chain to which
our equation can be reduced.

If the given equation is reduced to a chain of one-parameter equations
of a definite type, then for each of these one-parameter equations we can
compute a table, giving its roots as a function of its parameter. Then the
solution of the given equation is reduced to the use of a chain of such
tables.
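
As an illustration of such a table, the sketch below tabulates the real
root of the one-parameter equation $x^5 + x + A = 0$ mentioned above. The
method (bisection), the function name `real_root`, and the sample values
of $A$ are assumptions made for this example only. Bisection applies
because the left side is strictly increasing in $x$, so there is exactly
one real root; the complex roots are not tabulated here.

```python
def real_root(A, iterations=200):
    """Unique real root of x^5 + x + A = 0, found by bisection."""
    def g(x):
        return x**5 + x + A
    # g is negative at lo and positive at hi for every real A.
    lo, hi = -1.0 - abs(A), 1.0 + abs(A)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A small table: parameter A -> real root.
table = {A: real_root(A) for A in (-2, -1, 0, 1, 2)}
```

A chain of such lookups, one per equation in the chain, would then stand
in for the chain of radicals used in the solvable case.
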

Second, a still deeper problem consists of the converse of Galois theory.
Galois proved that the properties of the solutions of an equation depend
on its group. But conversely, can any group of permutations be the Galois
group of some equation, and can we set up all the equations whose Galois
group is a given group?

As to the first of these two questions only partial results are known,
although such outstanding mathematicians as Klein and Hilbert worked
on it persistently; the first general theorems were given by the remarkable
Soviet algebraist N. G. Čebotarev.

The second question for the so-called solvable groups, i.e., groups
satisfying Galois’ criterion, was solved in the affirmative in recent years
by the Soviet mathematician I. R. Šafarevič.

§3. The Fundamental Theorem of Algebra 

In the previous section we considered the attempts, lasting three
centuries, to solve by radicals an $n$th-degree equation. The problem turned
out to be very deep and difficult and led to the creation of new concepts,
important not only for algebra but also for mathematics as a whole.
As for the practical solution of equations, the result of all this work was
the following. It became clear that solution by radicals is far from being
available for all algebraic equations, and even when it is available, it is
of little practical value because of its complexity, except in the case of the
quadratic equation.

In view of this, mathematicians long ago began to work on the theory
of algebraic equations in three completely different directions, namely:
(1) on the problem of the existence of a root; (2) on the problem of how
we can learn from the coefficients of the equation something about its
roots without solving it; for example, does it have real roots and how
many; and finally, (3) on the approximate calculation of the roots of an
equation.

First of all, it was necessary to prove that in general any $n$th-degree
algebraic equation with real or complex coefficients always has at least
one real or complex root.*

This theorem, which is one of the most important in the whole of
mathematics, remained for a long time without rigorous proof. In view
of its importance and difficulty, it is generally called the “fundamental
theorem of algebra,” although the majority of the methods by which it
has been proved are as closely related to infinitesimal analysis as to algebra.
The first proof was given by d’Alembert. One point in d’Alembert’s proof,
as was later made clear, turned out to be defective. Namely, d’Alembert
assumed as trivial the general proposition of analysis that a continuous
function, given on a bounded and closed set of points, has somewhere
on the set a minimum. This is true but it had to be proved. A rigorous
proof of this property was obtained only in the second half of the 19th
century, i.e., a hundred years after d’Alembert’s investigations.

It is generally considered that the first rigorous proofs of the fundamental
theorem of algebra were given by Gauss; however, some of his
proofs require for full rigor no lesser additions than those required for
d’Alembert’s proof. Today a number of different completely rigorous
proofs of this theorem are known.

In the present section we consider the proof of the fundamental theorem
of algebra based on the so-called lemma of d’Alembert, and we also give
a complete proof of the aforementioned proposition from analysis.

The theory of complex numbers. Before considering the proof of the
fundamental theorem of algebra, we must first of all recall the theory of
complex numbers as studied in high school. The difficulties which led to
the creation of the theory of complex numbers are first encountered in
solving quadratic equations. What should we do if the number $p^2/4 - q$
under the square root in the formula for the solution of the quadratic
equation is negative? There exists no real number, positive or negative,
which is the square root of a negative number, since the square of any
real number is either positive or zero.

* The point is that there exist nonalgebraic equations, for example, $a^x = 0$, which
definitely do not have roots, either real or complex.

After long doubts, lasting more than a century, mathematicians arrived 
at the conclusion that it is necessary to introduce a new form of numbers, 
the so-called complex numbers, with the following laws of operations on 
them. 

Conventionally, a number of new character is introduced: $i = \sqrt{-1}$,
such that $i^2 = -1$, and numbers of the form $a + bi$ are considered, where
$a$ and $b$ are ordinary real numbers. The numbers $a + bi$ are called complex.
Two such numbers $a + bi$ and $c + di$ are regarded as equal if $a = c$,
$b = d$. The sum of two such numbers is defined to be the number
$(a + c) + (b + d)i$, and their difference is the number $(a - c) + (b - d)i$.
In multiplication we agree to multiply these numbers as if they were
binomials but to take into consideration that $i^2 = -1$, i.e.,

$$(a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i.$$
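
The multiplication rule can be compared with a ready-made implementation
of complex arithmetic. In the sketch below the function `multiply` is an
illustrative name; it applies the formula just given to pairs $(a, b)$ and
$(c, d)$, and the result is checked against Python's built-in `complex`
type.

```python
def multiply(a, b, c, d):
    """Product (a + bi)(c + di) as the pair (real part, imaginary part)."""
    return (a * c - b * d, b * c + a * d)

# (1 + 2i)(3 + 4i) by the formula ...
by_formula = multiply(1, 2, 3, 4)

# ... and by built-in complex arithmetic.
built_in = complex(1, 2) * complex(3, 4)
```
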

If $a$ and $b$ are regarded as rectangular coordinates of a point, and the
point is associated with the complex number $a + bi$, then the addition
and subtraction of complex numbers corresponds to the addition and
subtraction of vectors, i.e., of directed segments from the origin to
the points with coordinates $(a, b)$ and $(c, d)$, since in addition of vectors
their corresponding coordinates are added. As to the geometrical meaning
of a product in the so-called plane of complex numbers, we can see it more
easily if we consider the length $\rho$ of the vector from the origin of the
coordinate system to the point $(x, y)$ (this length is called the modulus of
the complex number $z = x + iy$) and the angle $\varphi$ which the vector makes
with the $Ox$-axis (this angle is called the argument of the complex
number $z = x + iy$); in other words, if we consider not the Cartesian
coordinates $x$ and $y$ but the so-called polar coordinates $\rho$ and $\varphi$
(figure 1). Then $x = \rho \cos\varphi$, $y = \rho \sin\varphi$, and consequently the
complex number itself can be written as

$$x + iy = \rho(\cos\varphi + i \sin\varphi).$$

Fig. 1.

If

$$a + bi = \rho_1(\cos\varphi_1 + i \sin\varphi_1), \qquad c + di = \rho_2(\cos\varphi_2 + i \sin\varphi_2),$$

then

$$ac - bd = \rho_1\rho_2(\cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2) = \rho_1\rho_2 \cos(\varphi_1 + \varphi_2),$$
$$bc + ad = \rho_1\rho_2(\sin\varphi_1 \cos\varphi_2 + \cos\varphi_1 \sin\varphi_2) = \rho_1\rho_2 \sin(\varphi_1 + \varphi_2);$$

from this we see that in multiplication of two complex numbers their
moduli $\rho_1$ and $\rho_2$ are multiplied, and the arguments $\varphi_1$ and $\varphi_2$ are added.
In division, since it is the inverse operation of multiplication, one modulus
is divided by the other, and the arguments are subtracted:

$$\rho_1(\cos\varphi_1 + i \sin\varphi_1) \cdot \rho_2(\cos\varphi_2 + i \sin\varphi_2)
= \rho_1\rho_2[\cos(\varphi_1 + \varphi_2) + i \sin(\varphi_1 + \varphi_2)]$$

and

$$\frac{\rho_1(\cos\varphi_1 + i \sin\varphi_1)}{\rho_2(\cos\varphi_2 + i \sin\varphi_2)}
= \frac{\rho_1}{\rho_2}[\cos(\varphi_1 - \varphi_2) + i \sin(\varphi_1 - \varphi_2)].$$

In raising to a power with positive integral exponent $n$, consequently,
the modulus is raised to the same $n$th power, and the argument is multiplied
by $n$:

$$[\rho(\cos\varphi + i \sin\varphi)]^n = \rho^n(\cos n\varphi + i \sin n\varphi).$$
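
These rules are easy to confirm numerically with the standard `cmath`
module, whose functions `polar` and `rect` convert between Cartesian and
polar form. The particular moduli and arguments below are arbitrary
sample values chosen for this sketch.

```python
import cmath
import math

# Two numbers given by modulus and argument.
z1 = cmath.rect(2.0, 0.5)
z2 = cmath.rect(3.0, 1.2)

# Product: moduli multiply (2 * 3), arguments add (0.5 + 1.2).
rho_prod, phi_prod = cmath.polar(z1 * z2)

# Quotient: moduli divide (2 / 3), arguments subtract (0.5 - 1.2).
rho_quot, phi_quot = cmath.polar(z1 / z2)

# Fifth power: modulus 2^5, argument 5 * 0.5.
power = z1 ** 5
expected_power = cmath.rect(2.0 ** 5, 5 * 0.5)
```

Note that `cmath.polar` reports the argument in the interval from $-\pi$
to $\pi$, so sums of arguments outside that range would come back reduced
by a multiple of $2\pi$; the sample values avoid this.
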


Conversely, taking roots,

$$\sqrt[n]{\rho(\cos\varphi + i \sin\varphi)} = \sqrt[n]{\rho}\left(\cos\frac{\varphi}{n} + i \sin\frac{\varphi}{n}\right).$$

However, in taking roots a special situation arises. Let $n$ be a positive
integral exponent. Then

$$\sqrt[n]{\rho(\cos\varphi + i \sin\varphi)}$$

is equal to the number

$$\sqrt[n]{\rho}\left(\cos\frac{\varphi}{n} + i \sin\frac{\varphi}{n}\right),$$

since raising this number to the $n$th power gives the radicand.

But this is only one value of the root. The point is that the complex
number

$$\sqrt[n]{\rho}\left[\cos\left(\frac{\varphi}{n} + \frac{2k\pi}{n}\right) + i \sin\left(\frac{\varphi}{n} + \frac{2k\pi}{n}\right)\right],$$

where $k$ is any of the numbers $1, 2, \ldots, n - 1$, will also be an $n$th root of
the number

$$\rho(\cos\varphi + i \sin\varphi).$$

Indeed, according to the rule for raising to a power, if we raise this number
to the $n$th power, we obtain the number

$$\left(\sqrt[n]{\rho}\right)^n\left[\cos n\left(\frac{\varphi}{n} + \frac{2k\pi}{n}\right) + i \sin n\left(\frac{\varphi}{n} + \frac{2k\pi}{n}\right)\right]
= \rho[\cos(\varphi + 2k\pi) + i \sin(\varphi + 2k\pi)],$$

where the addend $2k\pi$, because of the properties of sines and cosines, can
be neglected, since it changes neither sine nor cosine. Thus the $n$th power
of this number is also

$$\rho(\cos\varphi + i \sin\varphi),$$

i.e., this number is

$$\sqrt[n]{\rho(\cos\varphi + i \sin\varphi)}.$$

It is easy to see that no other complex number, besides these $n$ numbers
for $k = 0, 1, 2, \ldots, n - 1$, is an $n$th root of

$$\rho(\cos\varphi + i \sin\varphi).$$

Geometrically, the extraction of $n$th roots can be described as follows.
The points of the complex plane corresponding to the values of the $n$th root
of the number $\rho(\cos\varphi + i \sin\varphi)$ lie at the vertices of the regular $n$-sided
polygon inscribed in a circle drawn about the origin with radius $\sqrt[n]{\rho}$ and
so rotated that one of the vertices of this $n$-sided polygon has argument
$\varphi/n$ (figure 2).
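
The $n$ values of the root can be listed directly from the formula. The
function `nth_roots` below is an illustrative sketch, not part of the
text; it uses the standard `cmath` module and takes the three cube roots
of $-8$ as a sample.

```python
import cmath
import math

def nth_roots(z, n):
    """All n values of the nth root of a nonzero complex number z."""
    rho, phi = cmath.polar(z)
    radius = rho ** (1.0 / n)  # the real nth root of the modulus
    return [cmath.rect(radius, (phi + 2.0 * math.pi * k) / n)
            for k in range(n)]

# The cube roots of -8: the modulus is 8, the argument is pi.
roots = nth_roots(-8, 3)
```

The three values have modulus 2 and arguments $\pi/3$, $\pi$, and
$5\pi/3$, so they are the vertices of an equilateral triangle inscribed
in the circle of radius 2; the middle one is the real root $-2$.
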

We make the following observation. If

$$f(z) = z^n + c_1 z^{n-1} + \cdots + c_{n-1} z + c_n$$

is a polynomial in $z$ with given real or complex coefficients $c_1, c_2, \ldots, c_n$
and we change $z$ continuously, i.e., continuously shift the point $z = x + iy$
in the complex plane, then the complex point $Z = X + iY = f(z)$ will
also move continuously in the complex plane. This is clear from the fact
that if we substitute in $f(z)$ the values $z = x + iy$, $c_1 = a_1 + b_1 i$,
$c_2 = a_2 + b_2 i, \ldots, c_n = a_n + b_n i$, and perform all computations, we find
that

$$f(z) = X + iY,$$

where

$$X = P(x, y), \qquad Y = Q(x, y)$$

are $n$th-degree polynomials in $x$ and $y$ with real coefficients expressed in
terms of the $a_k$ and $b_k$. Under continuous change of $x$ and $y$, these
polynomials will also change continuously.

We also note that, since the modulus $\rho = |f(z)|$ is equal to $\sqrt{X^2 + Y^2}$,
during a continuous shift of the point $z$ in the complex plane, the modulus
$|f(z)|$ will also change continuously. In other words, if the point $z$ is
sufficiently close to the point $\alpha$, then the difference $|f(z)| - |f(\alpha)|$ of
absolute values is smaller in absolute value than any preassigned positive
number.



Fig. 2. Fig. 3.

Let us also remark that the modulus of a sum of several complex
numbers is always smaller than or equal to the sum of the moduli of these
numbers, which is equivalent to saying that the rectilinear segment $OE$
(figure 3) is shorter than or equal to the polygonal line $OABCDE$, being
equal to it if and only if all of its segments lie on one line and in one
direction.

We recall finally that to say “a complex number is equal to zero” is
the same as to say that “its modulus is equal to zero,” since the modulus
$\rho$ of a complex number is the distance from the origin to the corresponding
point.

We now apply the theory of complex numbers to the proof of the
fundamental theorem of algebra; though it must be remarked that the
significance of the theory of complex numbers goes far beyond the limits
of algebra. In many parts of mathematics other than algebra, we cannot
get along without them. In many applications, for example in the theory
of alternating currents, numerous problems are most simply solved by
means of complex numbers. But what is most important is the application
of complex numbers, or more precisely the theory of functions of a complex
variable, to the theory of certain special functions of two real variables
which are called harmonic. By means of these functions, important
problems in the theory of airplane flight, of heat conduction in a plate, of
plane electric fields, and of elasticity can be solved. A famous theorem
on the lifting force on an airplane wing was obtained by the founder of
contemporary aerodynamics, N. E. Žukovskiĭ, through investigations of
functions of a complex variable.*

We now pass to the proof of the fundamental theorem of algebra. 

Theorem. Any polynomial

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n,$$

whose coefficients

$$a_0, a_1, \ldots, a_{n-1}, a_n$$

are any given real or complex numbers, has at least one real or complex
root.

We will assume that the given polynomial is of degree $n$, i.e., that
$a_0 \neq 0$.

The surface of the modulus of a polynomial. We consider the whole
problem geometrically. Above each point $z$ of the complex plane, we erect
a perpendicular altitude $t$, equal in length to the modulus $|f(z)|$ of the
polynomial $f(z)$ at this point $z$. The ends of these altitudes define a surface
$M$, which can be called the modulus surface of the polynomial $f(z)$.
We see that this surface: (1) nowhere drops below the complex plane,
since the modulus of any complex number (in this case, the number $f(z)$)
is nonnegative; (2) for any given point $z$ of the complex plane, the surface
has one and only one point which either lies vertically above this point
or else coincides with it, i.e., the surface $M$ extends in one sheet above the
whole complex plane and may at some points touch the plane itself;
(3) the surface is continuous in the sense that a continuous change in the
position of the point $z$ on the complex plane produces a continuous change
in the value of $t = |f(z)|$, i.e., in the altitude $t$ of points of the surface.
(This was shown in the last subsection.)

The fundamental theorem of algebra consists in proving that the surface
$M$ touches the complex plane in one point at least and does not remain
everywhere at a positive distance above it.

* See Chapter IX.

On the growth of the modulus of a polynomial with increasing distance
from the origin. We show that no matter how large a positive number $G$
is given, we can find a radius $R$ such that for all points $z$ of the complex
plane lying outside of the circle of radius $R$ with center at the origin,
the altitude $t$ of points of the surface $M$ above the complex plane is
greater than $G$.

For let us write the polynomial $f(z)$ as

$$f(z) = a_0 z^n\left(1 + \frac{a_1}{a_0 z} + \frac{a_2}{a_0 z^2} + \cdots + \frac{a_n}{a_0 z^n}\right).$$

The modulus of the expression

$$\frac{a_1}{a_0 z} + \frac{a_2}{a_0 z^2} + \cdots + \frac{a_n}{a_0 z^n}$$

is not greater than the sum of the moduli of the summands

$$\left|\frac{a_1}{a_0 z}\right| + \left|\frac{a_2}{a_0 z^2}\right| + \cdots + \left|\frac{a_n}{a_0 z^n}\right|,$$

and, with an increase in the modulus of $z$, every one of these summands
decreases, so that the sum also decreases. Therefore, for all $z$ whose moduli
are greater than some number $R'$, the modulus of this expression in
parentheses is smaller, for example, than $\frac{1}{2}$.

But then for all such $z$, the expression

$$Q = 1 + \frac{a_1}{a_0 z} + \frac{a_2}{a_0 z^2} + \cdots + \frac{a_n}{a_0 z^n}$$

will have modulus greater than $\frac{1}{2}$. The modulus of the first factor $a_0 z^n$ is
equal to $|a_0| \cdot |z|^n$, so that it increases with increasing modulus of $z$;
moreover, it increases beyond all bounds. Therefore, no matter how large
a positive number $G$ is given, there exists a positive number $R$ such that
for all $z$ whose moduli are greater than $R$, $|f(z)| = |a_0| \cdot |z|^n \cdot |Q|$ is
greater than $G$.
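
The argument is constructive, and a suitable radius can be computed. The
sketch below makes one particular choice that is not in the text: it
takes $R' = \max(1, 2(|a_1| + \cdots + |a_n|)/|a_0|)$, which forces the
sum of moduli below $\frac{1}{2}$ for $|z| > R'$, and then enlarges the
radius until $\frac{1}{2}|a_0| R^n \geq G$. The names `radius_for` and
`poly_eval` and the sample polynomial are illustrative.

```python
import cmath
import math

def poly_eval(coeffs, z):
    """Evaluate a_0 z^n + ... + a_n (coefficients highest degree first)."""
    acc = 0
    for c in coeffs:
        acc = acc * z + c
    return acc

def radius_for(coeffs, G):
    """A radius R with |f(z)| > G whenever |z| > R (needs a_0 != 0, n >= 1, G > 0)."""
    a0 = coeffs[0]
    n = len(coeffs) - 1
    tail = sum(abs(a) for a in coeffs[1:]) / abs(a0)
    # For |z| > R' >= 1 each |a_k / (a_0 z^k)| <= |a_k / a_0| / |z|,
    # so the whole sum is below tail / |z| <= 1/2.
    r_prime = max(1.0, 2.0 * tail)
    # Then |f(z)| >= |a_0| |z|^n / 2, which exceeds G once |z|^n > 2G / |a_0|.
    return max(r_prime, (2.0 * G / abs(a0)) ** (1.0 / n))

coeffs = [1, -3, 2j, 5]          # f(z) = z^3 - 3z^2 + 2iz + 5
G = 1000.0
R = radius_for(coeffs, G)

# Sample |f| on a circle just outside radius R.
samples = [abs(poly_eval(coeffs, cmath.rect(1.001 * R, 2.0 * math.pi * k / 360)))
           for k in range(360)]
```

The radius found this way is usually far from the smallest possible one;
the proof needs only that some radius exists.
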


The existence of minima of the surface M. We will say that at a
point $\alpha$ of the complex plane the surface $M$ has a minimum if the value
of the altitude $t$ of the point of the surface $M$ at this point $\alpha$ is smaller than
or equal to its values at all points of some neighborhood of the point
$\alpha$, i.e., at all points of some circle, however small, with center at the
point $\alpha$.

Let the altitude $t$ of the point of the surface $M$ corresponding to the
origin, i.e., to the point $z = 0$ of the complex plane, be equal to $g$, i.e.,
$|f(0)| = g$. We take $G > g$. All altitudes $t$ of points of the surface $M$ are
nonnegative and change continuously during continuous movement of the
point $z$ in the complex plane. The surface $M$ has altitude $t > G$ outside
of a circle drawn about the origin with radius $R$ and altitude $t = g < G$
at the center of the circle. D’Alembert regarded it as an obvious
consequence that somewhere in the interior of the circle $R$ there is a point
where the altitude is a minimum; more precisely, where the value of $t$ is
smaller than or equal to its values at all remaining points of the circle $R$,
i.e., the surface $M$ has at least one minimum.

The rigorous proof of the existence of such a minimum is based on the
following axiom of continuity of the set of real numbers.

If two sequences of real numbers are given: $a_1 \leq a_2 \leq \cdots \leq a_n \leq \cdots$ and
$b_1 \geq b_2 \geq \cdots \geq b_n \geq \cdots$, such that $b_n > a_n$ for all $n$ and
$b_n - a_n \to 0$ as $n \to \infty$, then there exists one and only one real number
$c$ such that $a_n \leq c \leq b_n$ for all $n$.

Geometrically, this continuity property means that if on the line a
sequence of intervals $[a_n, b_n]$ (figure 4) is given, such that every successive
interval is contained in the preceding interval, and the lengths of the
intervals become arbitrarily small, then there exists a point $c$ belonging
to all intervals of the sequence. In other words, the intervals “shrink”
to a point, and not to “an empty place.”

Fig. 4. (The points $a_1, a_2, a_n, b_n, b_2, b_1$ and the point $c$ marked on a line.)
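
A familiar instance of such shrinking intervals is the approximation of
$\sqrt{2}$ by repeated halving. The sketch below is an illustration, not
part of the text: it builds intervals $[a_n, b_n]$ with rational endpoints
satisfying $a_n^2 < 2 < b_n^2$. Every interval lies in the preceding one
and the lengths tend to zero, yet no rational number lies in all of them;
the axiom guarantees that a real number $c$ does.

```python
from fractions import Fraction

def nested_intervals(steps):
    """Intervals [a_n, b_n] with a_n^2 < 2 < b_n^2, each half the previous."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        # A rational midpoint never squares to exactly 2,
        # so one of the two halves always keeps the invariant.
        if mid * mid < 2:
            a = mid
        else:
            b = mid
        intervals.append((a, b))
    return intervals

intervals = nested_intervals(30)
lengths = [b - a for a, b in intervals]
```
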

Since the length of the segment $[a_n, b_n]$ approaches zero with increasing
$n$, there is only one such point $c$. From the property of continuity for the
set of all points on the number axis, immediately follows the property
of continuity for complex numbers, i.e., for points of the plane. We give
a geometrical formulation of this property.

If in the plane a sequence of rectangles $\Delta_1, \Delta_2, \ldots, \Delta_n, \ldots$ is given, with
sides parallel to the coordinate axes, such that every rectangle is contained
in the previous one, and such that the length of their diagonals decreases
indefinitely, then there exists one and only one point which is contained
in all the rectangles of the sequence. This property of continuity of the
plane directly follows from the continuity property of the line. For the
proof it is sufficient to project the rectangles on the coordinate axes.

Now it is easy to establish the so-called Bolzano-Weierstrass theorem.

If in a rectangle an infinite sequence of points $z_1, z_2, \ldots, z_n, \ldots$ is given,
then in the interior or on the boundary of the rectangle there exists a point
$z_0$ such that in any arbitrarily small neighborhood of $z_0$, i.e., in the interior
of an arbitrarily small circle with center at $z_0$, there are infinitely many
points of the sequence $z_1, z_2, \ldots, z_n, \ldots$.

For the proof we denote the given rectangle by $\Delta_1$. We divide it into
four equal parts by lines parallel to the coordinate axes. At least one of
the parts necessarily contains infinitely many points of the given sequence.
This part will be denoted by $\Delta_2$. We again subdivide the rectangle $\Delta_2$
into four equal parts and select among them a $\Delta_3$ which contains infinitely
many points of the given sequence, and so on.

We obtain a sequence of imbedded rectangles $\Delta_1, \Delta_2, \Delta_3, \ldots$, whose
diagonals decrease indefinitely. By the continuity property we can find a
point $z_0$ contained in all these rectangles. Then this $z_0$ is the desired point.
For, no matter how small a neighborhood of $z_0$ we take, the rectangles
of the sequence $\Delta_1, \Delta_2, \Delta_3, \ldots$, beginning with some one of them, will be
inside this neighborhood, as soon as their diagonals become smaller than
the radius of the neighborhood, and any one of the rectangles contains
infinitely many points of the sequence $z_1, z_2, \ldots, z_n, \ldots$. Thus the
Bolzano-Weierstrass theorem is proved.

Now it is easy to prove the theorem on the minimum of the modulus
$|f(z)|$ of a polynomial. As before, let $|f(0)| = g$, let $G$ be a number greater
than $g$, and let $R$ be such that for $|z| > R$ we have $|f(z)| > G$.

If $g = 0$, i.e., $f(0) = 0$, then the modulus $|f(z)|$ of the polynomial has a
minimum at the point 0, since at all points it is $\geq 0$.

If $g > 0$ and $|f(z)| \geq g$ for all points $z$, then $|f(z)|$ still has a minimum
at the point 0. Let $g > 0$ and let points $z$ exist at which $|f(z)| < g$; then
in the sequence of numbers

$$0,\ \frac{g}{n},\ \frac{2g}{n},\ \ldots,\ \frac{n}{n}\,g = g \qquad (*)$$

we find the greatest $c_n = (i/n)g$ (here $i$ is an integer index, not the
imaginary unit) such that all values $|f(z)| \geq c_n$. For the next number
$c_n' = [(i + 1)/n]g$ of the sequence (*) there exists at least one
point $z_n$ such that $|f(z_n)| < c_n'$.

Let $n$ increase to infinity. For all $n$ we have $|z_n| \leq R$, since if $|z_n| > R$,
then $|f(z_n)|$ would be greater than $G$ and consequently greater also than $g$,
whereas $|f(z_n)| < c_n' \leq g$.

Thus all points $z_n$ lie inside a rectangle with sides $2R$ and with center
at the origin. It is possible that some of these points coincide.

By the Bolzano-Weierstrass theorem there exists a point $z_0$ such that
every neighborhood of $z_0$ contains infinitely many points of the sequence
$z_1, z_2, \ldots, z_n, \ldots$.

We establish that the point $z_0$ furnishes the desired minimum of $|f(z)|$.

For at any point $z$ we have

$$|f(z)| \geq c_n = c_n' - \frac{g}{n} > |f(z_n)| - \frac{g}{n}
= |f(z_0)| + [\,|f(z_n)| - |f(z_0)|\,] - \frac{g}{n}.$$




This inequality is valid for any $n$. If we take for $n$ a sequence of values for
which $z_n$ indefinitely approaches $z_0$, then on account of the continuity of
$|f(z)|$, the difference $|f(z_n)| - |f(z_0)|$ becomes arbitrarily small in absolute
value together with $g/n$.

Consequently, $|f(z)| \geq |f(z_0)|$, i.e., $|f(z)|$ actually has a minimum
at the point $z_0$.

D’Alembert’s lemma. In view of the fact that all the altitudes $t$ of
points on the modulus surface $M$ are nonnegative, it is clear that any
root of the polynomial $f(z)$, i.e., any point $z$ of the complex plane where
the polynomial $f(z)$ itself (and consequently its modulus $|f(z)|$ also) is
equal to zero, corresponds to a minimum of the modulus surface $M$.
However, as d’Alembert showed, the converse is also true: At any minimum
the surface $M$ extends down to the complex plane itself, and consequently
at that point there is a root of the polynomial $f(z)$. In other
words, at any point at which the altitude $t$ is positive and not zero, there
is no minimum of the surface $M$. This follows from the so-called
d’Alembert’s lemma:

If $\alpha$ is a given complex number such that $f(\alpha) \neq 0$, then a complex number
$h$ can always be found with arbitrarily small modulus, such that

$$|f(\alpha + h)| < |f(\alpha)|.$$

Proof. We consider the polynomial

$$f(\alpha + h) = a_0(\alpha + h)^n + a_1(\alpha + h)^{n-1} + \cdots + a_{n-1}(\alpha + h) + a_n$$

in two indeterminates $\alpha$ and $h$ and arrange it in ascending powers of $h$.
In this polynomial there will be a term not containing $h$ at all, namely

$$a_0\alpha^n + a_1\alpha^{n-1} + \cdots + a_{n-1}\alpha + a_n = f(\alpha) \neq 0,$$

since it was assumed that $f(\alpha) \neq 0$. There will also be a term with $h^n$,
namely $a_0 h^n$, since it was assumed that $a_0 \neq 0$. As to the terms with
intermediate powers of $h$, some of them, and in some cases all of them,
may be missing. Let the lowest power of $h$ which occurs in this polynomial
be $m$, where $1 \leq m \leq n$, i.e., this expression will have the form

$$f(\alpha + h) = f(\alpha) + Ah^m + Bh^{m+1} + Ch^{m+2} + \cdots + a_0 h^n.$$

Let us write this as:

$$f(\alpha + h) = f(\alpha) + Ah^m + Ah^m\left(\frac{B}{A}h + \frac{C}{A}h^2 + \cdots + \frac{a_0}{A}h^{n-m}\right),$$

where $A \neq 0$, and $B$, $C$, etc., may or may not be equal to zero.

After this préparation the proof of d’Alembert’s lemma runs as follows. 
For h it is sufficient to take a complex number with modulus so small 
that the length of the vector Ah” is smaller than the length of the vector 
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/(oc) and with argument such that the direction of the vector Ah m is opposite 
to the direction of the vector /(oc). Then the vector /(oc) + Ah m will be 
shorter than the vector /(oc) But if the modulus of A is taken sufficiently 
small, the modulus of the expression 

d*+5**+-+>-) 

can be made arbitrarily small, for example, smaller than one, and 
consequently, the vector 

J = ^4‘ + 7* ,+ +>■-) 

is shorter than the vector Ah"' and therefore, 
the vector /(a + h) = /(oc) + Ah m + A, as is 
seen in (figure 5), is also shorter than the vector 
/(oc), even if the direction of the vector A is in 
the opposite direction of the vector Ah'". 

The details of this proof are as follows: 

1. Since in multiplication, the arguments of 
the factors are added, we hâve to take the 
argument of h such that 

arg A + m - arg h = arg/(oc) + 180°, Fig. 5. 

i.e., it is necessary to take 

arg h = arg/» - arg A + 180° 
m 



2. The modulus of

$$\frac{B}{A}h + \frac{C}{A}h^2 + \cdots + \frac{a_0}{A}h^{n-m}$$

is not greater than the sum of the moduli of its summands

$$T = \left|\frac{B}{A}\right| |h| + \left|\frac{C}{A}\right| |h|^2 + \cdots + \left|\frac{a_0}{A}\right| |h|^{n-m};$$

moreover, with decreasing modulus of h, each of the summands of this
sum can be arbitrarily decreased and consequently so can the whole sum.
Therefore, if h is a complex number with the above given argument, and
$h_0$ is a modulus such that every h with modulus smaller than $h_0$ satisfies
the two conditions $|Ah^m| < |f(a)|$ and $T < 1$, then for such h we will
have $|f(a + h)| < |f(a)|$, which proves d’Alembert’s lemma.

From d’Alembert’s lemma it immediately follows that every minimum
of the modulus surface M of the polynomial f(z) gives a root of this
polynomial. Indeed, if at the point a, $f(a) \ne 0$, then by virtue of
d’Alembert’s lemma at arbitrarily close points a + h we would have
$|f(a + h)| < |f(a)|$, i.e., there would not exist a circle with center at a,
at all of whose points the modulus of f(z) is not smaller than the modulus
of f(a), and therefore at the point a we would not have a minimum of the
modulus of f(z). With this the fundamental theorem of algebra is proved.
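
The choice of h in d’Alembert’s lemma can be tried numerically. The following is only a sketch under stated assumptions: the helper names (horner, taylor_shift, descent_step), the starting modulus 1, the halving rule, and the tolerance eps are illustrative choices that do not come from the text. The argument of h is taken from the formula in step 1 of the proof, and the modulus is halved until the inequality of the lemma holds.

```python
import cmath


def horner(coeffs, z):
    """Evaluate a polynomial whose coefficients are listed in decreasing powers."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def taylor_shift(coeffs, a):
    """Coefficients of f(a + h) in increasing powers of h (repeated synthetic division)."""
    work = [complex(c) for c in coeffs]
    shifted = []
    while work:
        for i in range(1, len(work)):
            work[i] += a * work[i - 1]
        shifted.append(work.pop())  # the remainder is the next coefficient
    return shifted


def descent_step(coeffs, a, eps=1e-12):
    """Return h with |f(a + h)| < |f(a)|, assuming f(a) != 0 and degree >= 1."""
    expansion = taylor_shift(coeffs, a)
    f_a = expansion[0]
    # m is the lowest power of h with a nonzero coefficient A (eps is an assumed tolerance)
    m = next(k for k in range(1, len(expansion)) if abs(expansion[k]) > eps)
    A = expansion[m]
    # arg h = (arg f(a) - arg A + 180 degrees) / m, as in step 1 of the proof
    theta = (cmath.phase(f_a) - cmath.phase(A) + cmath.pi) / m
    r = 1.0
    while abs(horner(coeffs, a + cmath.rect(r, theta))) >= abs(f_a):
        r /= 2  # shrink the modulus of h until the inequality holds
    return cmath.rect(r, theta)
```

For instance, for $f(z) = z^2 + 1$ at the point a = 1 the call descent_step([1, 0, 1], 1) returns a value of h close to -1, and indeed |f(0)| = 1 is smaller than |f(1)| = 2.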

The general form of the modulus surface M. The modulus surface M
of the polynomial f(z) lies above the complex plane z. It has the form
shown in figure 6. It can be shown that at great altitudes t, the surface M
differs very little from the surface obtained by revolving the nth-degree
parabola $t = |a_0| x^n$ about the Ot-axis.

Fig. 6.

But for small t the surface M has minima, whose number is equal
to the number of distinct roots of the equation f(z) = 0. At all these
minima the surface M touches the complex plane z itself.

§4. Investigation of the Distribution of the Roots of a Polynomial 
on the Complex Plane 

A number of problems important in practice are connected with this
question: Without solving a given equation, obtain some information
about the distribution of its roots on the complex plane. The first such
problem, historically, was to determine the number of real roots of an
equation. That is, if an equation with real coefficients is given, then by
some test depending on its coefficients, to determine, without solving the
equation, whether it has real roots and if it does, how many; or how many
positive and how many negative roots it has; or how many real roots
lie between given limits a and b.

Derivatives of a polynomial. In this section an essential role will be
played by the derivative of a polynomial. The definition of the derivative
of a function was given in Chapter II.
For the polynomial $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$ the derivative is
given, as is well known, by the polynomial

$$n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + a_{n-1}.$$

The concept of derivative in Chapter II was considered only for functions
of a real variable. In algebra it is necessary to consider the variable as
taking on arbitrary complex values and to introduce polynomials with
complex coefficients.

However, the former definition of derivative can be retained, namely
as the limit of the ratio of the increment of the function to the increment
of the independent variable. The formula for computing the derivative of
a polynomial with complex coefficients, and the basic laws for the
derivative of sum, product, and power remain the same as before.*

Simple and multiple roots of a polynomial. In §2 of this chapter it
was established that if the number a is a root of the polynomial f(x),
then f(x) is divisible by $x - a$ without remainder. If f(x) is not divisible
by $(x - a)^2$, then the number a is called a simple root of the polynomial
f(x). Generally, if the polynomial f(x) is divisible by $(x - a)^k$ but not by
$(x - a)^{k+1}$, then the number a is called a root of multiplicity k.

A root a of multiplicity k is often counted as k separate roots. The
basis for this is that the factor $(x - a)^k$, present in the factorization of
f(x) into linear factors, is the product of k factors, each equal to $(x - a)$.

By virtue of the fact that every polynomial of degree n can be factored
into the product of n linear factors, the number of roots of the polynomial
is equal to its degree, if we take into account the multiplicity of each root.

The following theorems are true:

1. A simple root of a polynomial is not a root of its derivative.

2. A multiple root of a polynomial is a root of its derivative of
multiplicity one less.

For, let $f(x) = (x - a)^k f_1(x)$ and let $f_1(x)$ not be divisible by $(x - a)$,
i.e., $f_1(a) \ne 0$. Then

$$f'(x) = k(x - a)^{k-1} f_1(x) + (x - a)^k f_1'(x) = (x - a)^{k-1}\left[k f_1(x) + (x - a) f_1'(x)\right] = (x - a)^{k-1} F(x).$$

The polynomial $F(x) = k f_1(x) + (x - a) f_1'(x)$ is not divisible by
$(x - a)$, since $F(a) = k f_1(a) \ne 0$.
Consequently, $f'(x)$ for k = 1 is not divisible by $x - a$, and for k > 1, $f'(x)$
is divisible by $(x - a)^{k-1}$ but not by $(x - a)^k$. With this both theorems
are proved.

* See Chapter IX.
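
The two theorems on multiple roots can be checked on a small example. The sketch below is an illustration only; the function names and the test polynomial $(x - 1)^3 (x + 2) = x^4 - x^3 - 3x^2 + 5x - 2$ are assumptions chosen for the purpose and do not appear in the text. The multiplicity of a root is found by counting how many successive derivatives vanish at it, which is what repeated application of theorem 2 suggests.

```python
def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are listed in decreasing powers."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative(coeffs):
    """Coefficients of the derivative, again in decreasing powers."""
    n = len(coeffs) - 1
    return [coeffs[i] * (n - i) for i in range(n)]


def multiplicity(coeffs, a):
    """Multiplicity of a as a root of a nonzero polynomial (0 if a is not a root)."""
    current = list(coeffs)
    k = 0
    while current and horner(current, a) == 0:
        k += 1
        current = derivative(current)
    return k


f = [1, -1, -3, 5, -2]  # (x - 1)^3 (x + 2), integer coefficients keep the test exact
print(multiplicity(f, 1), multiplicity(derivative(f), 1))    # triple root, then double
print(multiplicity(f, -2), multiplicity(derivative(f), -2))  # simple root, then not a root
```

Exact integer (or rational) coefficients are used deliberately, since a test of the form "equals zero" is unreliable in floating point.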

Rolle’s theorem and some of its consequences. According to the
well-known theorem of Rolle,* if the real numbers a and b are roots of a
polynomial with real coefficients, then there exists a number c lying
between a and b which is a root of the derivative.

From Rolle’s theorem the following interesting theorems follow:

1. If all roots of the polynomial $f(x) = a_0 x^n + \cdots + a_n$ are real, then
all roots of its derivative are also real. In addition, between two adjacent
roots of f(x) there exists one root of $f'(x)$ and this root is simple. Indeed,
let $x_1 < x_2 < \cdots < x_k$ be the roots of f(x) with multiplicities $m_1, m_2, \ldots, m_k$,
respectively. Clearly, $m_1 + m_2 + \cdots + m_k = n$.

Then the derivative $f'(x)$, by the above theorem on multiple roots, will
have roots $x_1, x_2, \ldots, x_k$ with multiplicities $m_1 - 1, m_2 - 1, \ldots, m_k - 1$,
and by Rolle’s theorem there is at least one root $y_1, y_2, \ldots, y_{k-1}$ in the
interior of each of the intervals $(x_1, x_2), (x_2, x_3), \ldots, (x_{k-1}, x_k)$ between
two successive roots of f(x). Thus, the number of real roots of $f'(x)$ is
equal (with regard to multiplicities) to at least

$$(m_1 - 1) + (m_2 - 1) + \cdots + (m_k - 1) + k - 1 = n - 1.$$

But $f'(x)$ as an (n - 1)th-degree polynomial has (with regard to
multiplicities) n - 1 roots. Consequently, all
roots of $f'(x)$ are real, $y_1, y_2, \ldots, y_{k-1}$ are simple roots, and roots other
than $x_1, x_2, \ldots, x_k$ and $y_1, y_2, \ldots, y_{k-1}$ of the polynomial $f'(x)$ do not
exist.

2. If all roots of a polynomial f(x) are real and of these p are positive,
then $f'(x)$ has p or p - 1 positive roots.

For, let $x_1 < x_2 < \cdots < x_k$ be all positive roots of the polynomial
f(x) with multiplicities $m_1, m_2, \ldots, m_k$, respectively. Then
$m_1 + m_2 + \cdots + m_k = p$. The derivative $f'(x)$ will have the following positive roots:
$x_1, x_2, \ldots, x_k$ with multiplicities $m_1 - 1, m_2 - 1, \ldots, m_k - 1$; simple
roots $y_1, y_2, \ldots, y_{k-1}$ lying in the intervals $(x_1, x_2), \ldots, (x_{k-1}, x_k)$; and it
can also have a simple root $y_0$ lying in the interval $(x_0, x_1)$ where $x_0$ is the
largest nonpositive root of f(x). Consequently, the number of positive
roots is equal to $(m_1 - 1) + \cdots + (m_k - 1) + k - 1 = p - 1$ or
$(m_1 - 1) + \cdots + (m_k - 1) + (k - 1) + 1 = p$, which was required to be
proved.

* This theorem is the simplest form of the mean value theorem, which was mentioned
in Chapter II.

Descartes’ law of signs. In his significant book of 1637 “Geometry,”
in which the first presentation of analytic geometry was given, Descartes,
among other things, gave the first significant algebraic theorem concerning
the distribution of roots of a polynomial on the complex plane, the
so-called “Descartes’ law of signs.” It can be stated as follows:

If the coefficients of an equation are real and all its roots are also known
to be real, then the number of its positive roots, with account taken of
multiplicities, is equal to the number of changes of sign in the sequence of
its coefficients. If it also has complex roots, then this number is equal to or
an even number less than the number of these changes in sign.

We first explain what we mean by the number of changes of sign in the
sequence of coefficients of the equation. To obtain this number we write
down all coefficients of the equation, for example in the order of decreasing
powers of the unknown, including the coefficient of $x^n$ and the constant
term, but omitting coefficients equal to zero, and consider all pairs of
successive numbers of the sequence so obtained. If in such a pair the signs
of the numbers are different, then we call this a change of sign. For
example, if the given equation is

$$x^7 + 3x^6 - 5x^4 - 8x^2 + 7x + 2 = 0,$$

then the sequence of its coefficients is

1, 3, -5, -8, 7, 2

and there are 2 changes of sign.
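
Counting changes of sign is purely mechanical, as the following short sketch shows. The function name sign_changes is an illustrative choice; the first test sequence is the one from the example above, and the second polynomial, $(x - 1)(x - 2)(x + 3) = x^3 - 7x + 6$, is an assumed example with all roots real, so that by the first part of Descartes’ law the count should equal its number of positive roots.

```python
def sign_changes(coefficients):
    """Number of changes of sign in a sequence, zero entries being omitted."""
    nonzero = [c for c in coefficients if c != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if (u > 0) != (v > 0))


print(sign_changes([1, 3, -5, -8, 7, 2]))  # the sequence from the example
print(sign_changes([1, 0, -7, 6]))         # x^3 - 7x + 6, positive roots 1 and 2
```

Both calls give 2, in agreement with the example and with the two positive roots of the second polynomial.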

Now we pass to the proof of the first part of the theorem.*

Without loss of generality we can assume that the leading coefficient $a_0$
of the polynomial $f(x) = a_0 x^n + \cdots + a_n$ is positive.

First of all, we establish that if f(x) has only real roots and of these p
are positive (counting multiplicities) then $(-1)^p$ is the sign of the last
coefficient of f(x) different from zero.

Indeed, let

$$f(x) = a_0 x^n + \cdots + a_k x^{n-k} = a_0 x^{n-k}(x - x_1) \cdots (x - x_p)(x - x_{p+1}) \cdots (x - x_{n-k}),$$

where $x_1, \ldots, x_p$ are the positive roots of f(x), and $x_{p+1}, \ldots, x_{n-k}$ are the
negative roots of f(x), account being taken of the multiplicity of each root.
Then $a_k = a_0 (-1)^p x_1 \cdots x_p (-x_{p+1}) \cdots (-x_{n-k})$ and, since all the
numbers $a_0, x_1, \ldots, x_p, -x_{p+1}, \ldots, -x_{n-k}$ are positive, the sign of $a_k$ is
$(-1)^p$.†

* We could give another, direct proof, not involving derivatives, but it would be
somewhat longer.

† We note that this assertion is also correct for the case when some of the roots of
f(x) are complex.

The subsequent proof is based on the method of mathematical induction.
For first-degree polynomials the theorem is trivial. Indeed, a first-degree
polynomial $a_0 x + a_1$ has a unique root $-a_1/a_0$, which is positive if and
only if $a_0$ and $a_1$ have opposite signs.

Let us assume now that the theorem is proved for all polynomials of
(n - 1)th degree with real roots, and with this assumption we will prove
it for any polynomial $f(x) = a_0 x^n + \cdots + a_{n-1} x + a_n$ of degree n.

1. $a_n = 0$. We consider the polynomial $f_1(x) = a_0 x^{n-1} + \cdots + a_{n-1}$.
The positive roots of the polynomials f(x) and $f_1(x)$ are the same; the
number of changes of sign in the sequence of their coefficients is also the
same. For the polynomial $f_1(x)$ Descartes’ law is valid; consequently it
is valid for the polynomial f(x).

2. $a_n \ne 0$. We consider the derivative

$$f'(x) = n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + a_{n-1}.$$

It is clear that the number of changes of sign in the sequence of
coefficients of the polynomial f(x) is equal to the analogous number for the
derivative $f'(x)$, if the signs of $a_n$ and the last nonzero coefficient of the
derivative coincide, or it is one more, if the signs are opposite.

By what was said above at the beginning of the proof, in the first case
the numbers of positive roots of f(x) and of $f'(x)$ have the same parity
(are both even or both odd), and in the second case they have opposite
parity. But as we deduced from Rolle’s theorem, the number of positive
roots of a polynomial, if all its roots are real, can be either equal to the
number of positive roots of its derivative, or be one more. Taking this
into consideration, we note that in the first case f(x) has the same number
of positive roots as $f'(x)$, and in the second case one more. For $f'(x)$
Descartes’ law is valid by the induction assumption, i.e., the number of
positive roots of $f'(x)$ is equal to the number of changes of sign in the
sequence of its coefficients. Consequently, in both cases the number of
positive roots of f(x) is equal to the number of changes of sign in the
sequence of coefficients, and this is the required proof.

The second part of Descartes’ law is not more complicated to establish,
and we will omit the proof here.

Remark 1. The first assertion of Descartes’ theorem is particularly
important, since in many practical problems it is known in advance
that all roots of a given equation are real. In this case it can be
quickly determined how many roots are positive and how many negative.
Also it can be seen at once how many zero roots the equation has.

Remark 2. If in the given polynomial we set x = y + a, where a is
an arbitrary given real number, i.e., we form the polynomial f(y + a),
then the positive roots y of this polynomial will be those and only those
that are obtained from the roots x of the given polynomial f(x) that are
greater than a. Therefore the number of roots of the given polynomial
f(x), all of whose roots are real, lying between given limits a and b (b > a),
is equal to the number of changes of sign for the polynomial f(y + a)
minus the number of changes of sign for the polynomial f(z + b). If,
however, not all roots of f(x) are real, then it can be shown that this
number is equal to this difference or some even number less. This is the
so-called Budan theorem.
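
Remark 2 can be tried out directly: expand f(y + a) and f(z + b), count the changes of sign in each, and subtract. The sketch below is only an illustration; the function names and the sample polynomial $x^3 - 7x + 6$ (roots -3, 1, 2, all real) are assumptions, and exact rational arithmetic from the standard fractions module is used so that signs are never spoiled by rounding.

```python
from fractions import Fraction


def sign_changes(values):
    """Number of changes of sign in a sequence, zero entries being omitted."""
    nonzero = [v for v in values if v != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if (u > 0) != (v > 0))


def shift(coeffs, a):
    """Coefficients of f(y + a), in increasing powers of y.

    coeffs lists the coefficients of f in decreasing powers; the expansion is
    obtained by repeated synthetic division by (x - a).
    """
    work = [Fraction(c) for c in coeffs]
    shifted = []
    while work:
        for i in range(1, len(work)):
            work[i] += a * work[i - 1]
        shifted.append(work.pop())
    return shifted


def budan_count(coeffs, a, b):
    """Changes of sign of f(y + a) minus changes of sign of f(z + b)."""
    return sign_changes(shift(coeffs, a)) - sign_changes(shift(coeffs, b))


f = [1, 0, -7, 6]                           # x^3 - 7x + 6
print(budan_count(f, 0, 3))                 # roots 1 and 2 lie between 0 and 3
print(budan_count(f, 0, Fraction(3, 2)))    # only the root 1 lies between 0 and 3/2
```

The order in which the coefficients are listed (increasing or decreasing powers) does not affect the number of changes of sign, which is why the shifted polynomial is not reversed before counting.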

Sturm’s theorem. Descartes’ law of signs, as well as Budan’s theorem,
do not, however, give an answer to the problem: Does a given equation
with real coefficients have at least one real root, how many real roots does
it have altogether, and how many real roots does it have lying between
given limits a and b? For more than two centuries mathematicians
attempted to solve these problems but without result. A long series of
efforts in this direction were made by Descartes, Newton, Sylvester,
Fourier, and many others, but they did not succeed in solving even the
first of these problems, until finally in 1835 the French mathematician
Sturm suggested a method that solved all three problems.

Sturm’s method is really not very complicated, but it is of such a
character that one might seek it for a long time and not find it. Sturm
himself was very happy that he had succeeded in solving this remarkable
and exceedingly important practical problem of algebra. In his lectures,
when he came to the presentation of his result, he usually said: “Here is
the theorem whose name I bear.” But it must be said that Sturm did not
solve this problem by mere chance; he pondered for many years on
questions related to it.

Let f(z) be a polynomial with real coefficients and $f_1(z)$ be the derivative
$f'(z)$. Let us divide the polynomial f(z) by $f_1(z)$ and denote the remainder
in this division by $f_2(z)$, taking it with the opposite sign. Then, divide
$f_1(z)$ by $f_2(z)$ and denote the remainder, taken with opposite sign, by
$f_3(z)$, etc.

It can be shown that the last nonzero polynomial $f_s(z)$ of the constructed
sequence will be a constant number c (when f(z) has no multiple roots).

Sturm’s theorem is as follows: If a < b are two real numbers, which are
not roots of the polynomial f(z), then substituting in the polynomials

$$f(z), f_1(z), \ldots, f_{s-1}(z), c$$

z = a and z = b, we obtain two sequences of real numbers

$$f(a), f_1(a), f_2(a), \ldots, f_{s-1}(a), c, \quad \text{(I)}$$

$$f(b), f_1(b), f_2(b), \ldots, f_{s-1}(b), c, \quad \text{(II)}$$

such that the number of changes of sign in sequence (I) is greater than or
equal to the number of changes of sign in sequence (II) and the difference
between these numbers of changes of sign is exactly equal to the number
of real roots of f(z) lying between a and b, or in other words, the number
of these roots is equal to the loss of changes of sign in sequence (I) in
going from a to b.

The proof of Sturm’s theorem is not more difficult than the proof of
Descartes’ theorem, but we will not give it here.

Sturm’s theorem enables us to compute the number of roots of a
polynomial with real coefficients on an arbitrary segment of the real axis. Therefore
the application of Sturm’s theorem to any given polynomial gives a clear
picture of the distribution of the roots of the polynomial on the real axis; in
particular it enables us to separate the roots, i.e., to construct segments in each
of which only one root of the polynomial is contained.
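
The construction of Sturm’s sequence and the count of real roots between a and b can be sketched as follows. This is a minimal illustration, not a definitive implementation: all function names are assumptions, the polynomial is assumed to have no multiple roots and a and b are assumed not to be roots, and exact rational arithmetic is used so that the signs are reliable. The sample polynomial $x^3 - 3x + 1$ has three real roots, near -1.88, 0.35 and 1.53.

```python
from fractions import Fraction


def sign_changes(values):
    """Number of changes of sign in a sequence, zero entries being omitted."""
    nonzero = [v for v in values if v != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if (u > 0) != (v > 0))


def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are listed in decreasing powers."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def derivative(coeffs):
    n = len(coeffs) - 1
    return [coeffs[i] * (n - i) for i in range(n)]


def remainder(num, den):
    """Remainder of the division of num by den (decreasing powers)."""
    num = list(num)
    while len(num) >= len(den):
        q = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= q * den[i]
        num.pop(0)  # the leading coefficient is now zero
    while num and num[0] == 0:
        num.pop(0)  # strip leading zeros of the remainder
    return num


def sturm_sequence(coeffs):
    """f, f' and the successive remainders taken with the opposite sign."""
    f0 = [Fraction(c) for c in coeffs]
    seq = [f0, derivative(f0)]
    while True:
        r = remainder(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])


def count_real_roots(coeffs, a, b):
    """Loss of changes of sign in Sturm's sequence in going from a to b."""
    seq = sturm_sequence(coeffs)
    at_a = sign_changes([horner(p, a) for p in seq])
    at_b = sign_changes([horner(p, b) for p in seq])
    return at_a - at_b


print(count_real_roots([1, 0, -3, 1], -2, 2))  # all three real roots
print(count_real_roots([1, 0, -3, 1], 0, 1))   # the single root near 0.35
```

Repeating the second call on smaller and smaller segments is one way to carry out the separation of roots described above.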

In many applications, the solution of the analogous problem for the
complex roots of a polynomial is equally important. Since complex
numbers are represented by points not on the line but in the plane, it is
impossible to speak of “segments” in which complex roots are contained;
instead of a segment, we have to consider a region, i.e., a part of the plane,
chosen in one way or another.

Thus, with respect to complex roots the following problem arises:

Given a polynomial f(z) and a region in the complex plane, it is required
to find the number of roots of the polynomial inside this region.

We assume that the region is bounded by a closed contour (figure 7)
and that on the contour the polynomial f(z) does not have roots.

Imagine that the point z goes around the contour of the region once in
the positive direction. Every value of the polynomial is also represented
by a point on the plane. With continuous change of z the polynomial f(z)
also changes continuously. Therefore, while z goes once around the
contour of the region, f(z) describes some closed curve. This curve will
not go through the origin of the coordinate system, since f(z) by assumption
does not reduce to zero at any of the points of the contour (figure 8).

Fig. 7.

Fig. 8.

The answer to the above-mentioned problem is given by the following
theorem:

Principle of the argument. The number of roots of the polynomial
f(z) inside the region bounded by a closed curve C is equal to the number
of times the point f(z) winds around the origin as z goes around the contour
C once in the positive direction.

For the proof we decompose f(z) into linear factors

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n = a_0 (z - z_1)(z - z_2) \cdots (z - z_n).$$

We know that the argument of the product of several complex numbers
is equal to the sum of the arguments of the factors. Consequently,

$$\arg f(z) = \arg a_0 + \arg (z - z_1) + \arg (z - z_2) + \cdots + \arg (z - z_n).$$

Let us denote by $\Delta \arg f(z)$ the increment of the argument of f(z),
computed under the assumption that z goes once around the contour C.
It is clear that $\Delta \arg f(z)$ is $2\pi$ multiplied by the number of times the point
f(z) winds around the origin.

Clearly,

$$\Delta \arg f(z) = \Delta \arg a_0 + \Delta \arg (z - z_1) + \Delta \arg (z - z_2) + \cdots + \Delta \arg (z - z_n).$$

It is clear that $\Delta \arg a_0 = 0$ since $a_0$ is a constant. Then, $z - z_1$ is
represented by the vector going from the point $z_1$ to the point z. Let us
assume that $z_1$ is in the interior of the region. Geometrically it is clear
(figure 9) that as the point z goes around the contour C the vector $z - z_1$
makes a complete revolution about its initial point, so that
$\Delta \arg (z - z_1) = 2\pi$. We assume now that the point $z_2$ is in the exterior
of the region. In this case the vector “oscillates” to one side and back,
and returns to its original position without making a revolution about
its initial point, so that $\Delta \arg (z - z_2) = 0$. We can reason the same way
about all the roots. Consequently $\Delta \arg f(z)$ is equal to $2\pi$ multiplied by
the number of roots of f(z) lying in the interior of the region. Hence the
number of roots of f(z) inside the region is equal to the number of times
the point f(z) winds around the origin, and this is the required proof.

This theorem enables us to solve the problem in every particular case,
since we can draw the curve traced by the point f(z) with any degree of accuracy.
To do this it is necessary to take a sufficiently dense set of points z on the
contour C, to compute the corresponding values f(z) and to join them by
a continuous curve. However, in some cases we can get by without these
tedious computations. We indicate one of the methods with a numerical
example.

Example. Let us find the number of roots of the polynomial
$f(z) = z^{11} + 5z^2 - 2$ inside a circle of radius 1 with center at the origin.

On the indicated circle |z| = 1, one of the three terms which make
up the polynomial f(z), namely $5z^2$, dominates the others. Indeed,
$|5z^2| = 5$, but $|z^{11} - 2| \le |z|^{11} + 2 = 3$. This property allows us to
reason thus. Let us denote $z^{11} + 5z^2 - 2$ by w, $5z^2$ by $N_1$, and $z^{11} - 2$ by
$N_2$. While the point z goes once around the unit circle, $N_1 = 5z^2$ winds
around a circle of radius 5 twice, since $|N_1| = 5$ and $\arg N_1 = 2 \arg z$.
The point w is “tethered” to the point $N_1$ by a vector whose length is
$|N_2| \le 3$, i.e., the distance from the point w to the point $N_1$ is at all
times smaller than the distance from $N_1$ to the origin of the coordinate
system.

Consequently, the point w, however it may “wind” around $N_1$ (figure 10),
cannot “independently” go around the origin, and therefore winds around
the origin exactly as many times as the point $N_1$ does, i.e., twice.
Consequently, the number of roots of f(z) in the interior of the region in
question is equal to two.
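
The “tedious computation” mentioned before the example, namely following f(z) along a dense set of points of the contour, is easily delegated to a machine. The following sketch does this for a circle centered at the origin; the function name, the number of sample points, and the restriction to circles are assumptions made for illustration. The increments of the argument between neighboring sample points are summed, which is valid as long as the points are dense enough for each increment to be smaller than a half turn.

```python
import cmath
import math


def winding_number(f, radius=1.0, samples=4000):
    """Number of times f(z) winds around the origin as z runs once,
    counterclockwise, around the circle |z| = radius.

    Assumes f has no roots on the circle and that the sample points are
    dense enough for each step to change arg f(z) by less than pi.
    """
    total = 0.0
    previous = f(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = cmath.rect(radius, 2 * math.pi * k / samples)
        current = f(z)
        total += cmath.phase(current / previous)  # increment of the argument
        previous = current
    return round(total / (2 * math.pi))


def example_polynomial(z):
    return z**11 + 5 * z**2 - 2


print(winding_number(example_polynomial))
```

The printed value is 2, in agreement with the reasoning of the example.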

Hurwitz’s problem. In mechanics, particularly in the theory of
oscillations and control, an important role is played by the conditions
that permit us to decide whether all the roots of a given polynomial
$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ (with real coefficients) have negative real
parts, i.e., lie in the half plane left of the imaginary axis.

One of the criteria for solving this problem is easy to obtain from reasoning
similar to the principle of the argument.

We will assume that $a_0 > 0$.
Let the point z (figure 11) move on the imaginary axis downward from
above, i.e., let z = iy as y changes from $+\infty$ to $-\infty$, remaining real.
Then f(z) describes a curve with infinite branches. For our investigation
the closely related curve described by the function

$$f_1(z) = i^{-n} f(z) = a_0 y^n - a_2 y^{n-2} + a_4 y^{n-4} - \cdots - i\left(a_1 y^{n-1} - a_3 y^{n-3} + \cdots\right) = \varphi(y) - i\psi(y),$$

where

$$\varphi(y) = a_0 y^n - a_2 y^{n-2} + \cdots,$$

$$\psi(y) = a_1 y^{n-1} - a_3 y^{n-3} + \cdots,$$

is more convenient.

Since $\arg i = \pi/2$, we have $\arg f_1(z) = -n\pi/2 + \arg f(z)$, and consequently
the increments of the arguments of f(z) and $f_1(z)$ are the same.

Let us compute the increment of the argument of the point $f_1(z)$ as z
moves on the imaginary axis downward.

Let $f(z) = a_0 (z - z_1)(z - z_2) \cdots (z - z_n)$. Then

$$\arg f_1(z) = \arg (a_0 i^{-n}) + \arg (z - z_1) + \arg (z - z_2) + \cdots + \arg (z - z_n).$$

It is clear geometrically that the increment of $\arg (z - z_k)$ is equal to $\pi$, if
$z_k$ lies in the right half plane and to $-\pi$, if $z_k$ lies in the left half plane
(figure 11).
Therefore the increment of the argument of $f_1(z)$ is equal to $\pi(N_1 - N_2)$,
where $N_1$ is the number of roots of f(z) in the right half plane and $N_2$ is the
number of roots in the left half plane. For all the roots to lie in the left
half plane it is necessary and sufficient that the increment of the argument
of the point $f_1(z)$ be equal to $-\pi n$, i.e., that the point $f_1(z)$ make n half
revolutions clockwise about the origin (figure 12).

We note that the point $f_1(z) = \varphi(y) - i\psi(y)$ intersects the imaginary
axis for those values of y that are roots of $\varphi(y)$ and the real axis for the
roots of $\psi(y)$. Since $\varphi(y)$ has no more than n real roots and the number of
real roots of $\psi(y)$ is not more than n - 1, it is easy to see geometrically
that $f_1(z)$ can make n half revolutions in the clockwise direction if
and only if the curve comes from the fourth quadrant and intersects in
turn the negative half of the imaginary axis, the negative half of the real
axis, the positive half of the imaginary axis, the positive half of the real
axis, etc., so that the total number of points of intersection with the
imaginary axis is equal to n (one for every half revolution), and with the
real axis it is equal to n - 1 (one less than the number of half revolutions).
Therefore the coefficient $a_1$ must be positive, and the roots of the
polynomials $\varphi(y)$ and $\psi(y)$ must be all real and alternating. This last statement
means that if $y_1 > y_2 > \cdots > y_n$ are the roots of $\varphi(y)$ arranged in
decreasing order, and $\eta_1 > \eta_2 > \cdots > \eta_{n-1}$ are the roots of $\psi(y)$, then

$$y_1 > \eta_1 > y_2 > \eta_2 > \cdots > y_{n-1} > \eta_{n-1} > y_n.$$

Thus, in order that all roots of the polynomial $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$
with real coefficients and $a_0 > 0$ lie in the left half plane, it is necessary
and sufficient that the coefficient $a_1$ be positive and the roots of the
polynomials $\varphi(y) = a_0 y^n - a_2 y^{n-2} + a_4 y^{n-4} - \cdots$ and
$\psi(y) = a_1 y^{n-1} - a_3 y^{n-3} + \cdots$ be all real and alternating.

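This condition is equivalent to the well-known condition of Hurwitz
to the effect that all the following determinants are positive:

$$a_1, \quad \begin{vmatrix} a_1 & a_0 \\ a_3 & a_2 \end{vmatrix}, \quad \begin{vmatrix} a_1 & a_0 & a_{-1} \\ a_3 & a_2 & a_1 \\ a_5 & a_4 & a_3 \end{vmatrix}, \quad \ldots, \quad \begin{vmatrix} a_1 & a_0 & a_{-1} & \cdots & a_{2-n} \\ a_3 & a_2 & a_1 & \cdots & a_{4-n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{2n-1} & a_{2n-2} & a_{2n-3} & \cdots & a_n \end{vmatrix},$$

where all $a_i$ with indices less than 0 or greater than n are replaced by zero
(on determinants, see Chapter XVI, §3).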
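
Hurwitz’s condition lends itself to direct computation. In the sketch below the entry in row i and column j of the largest determinant is taken to be $a_{2i-j}$ (rows and columns numbered from 1), which is how the general determinant above reads; the function names and the two sample polynomials are assumptions chosen for illustration. Determinants are evaluated by elimination in exact rational arithmetic.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row exchange changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def hurwitz_determinants(coeffs):
    """Leading principal minors of the matrix with entries a_(2i - j).

    coeffs = [a_0, a_1, ..., a_n]; coefficients with index outside 0..n
    are replaced by zero.
    """
    n = len(coeffs) - 1

    def a(k):
        return Fraction(coeffs[k]) if 0 <= k <= n else Fraction(0)

    full = [[a(2 * i - j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    return [determinant([row[:k] for row in full[:k]]) for k in range(1, n + 1)]


def all_roots_in_left_half_plane(coeffs):
    """Hurwitz's test, assuming real coefficients and a_0 > 0."""
    return all(d > 0 for d in hurwitz_determinants(coeffs))


# (z + 1)(z + 2)(z + 3) = z^3 + 6z^2 + 11z + 6: all roots in the left half plane
print(hurwitz_determinants([1, 6, 11, 6]))
# z^3 + z^2 + z + 6 = (z + 2)(z^2 - z + 3): two roots with positive real part
print(all_roots_in_left_half_plane([1, 1, 1, 6]))
```

For the first polynomial the determinants are 6, 60 and 360; for the second the test fails because the second determinant equals -5.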


§5. Approximate Calculation of Roots 

Sturm’s method in combination with the lower limit of the difference
of two distinct real roots allows us to construct the “separation” of real
roots of a polynomial with real coefficients, i.e., allows us to determine
for each root limits a and b between which only this one root can be found.
It remains to discover a suitable method for finding, in the segment
from a to b, numbers $\alpha_1 < \alpha_2 < \alpha_3 < \cdots$ and $\beta_1 > \beta_2 > \beta_3 > \cdots$, which
converge as rapidly as possible to the desired root, the first sequence
being an approximation by defect and the second by excess. Each of the
two approximations $\alpha_k$ and $\beta_k$ clearly differs from the desired root x by
less than their difference $\beta_k - \alpha_k$, since the root lies between them. Thus
we can find upper bounds for the error when we stop at any given
approximation.

Graph of a polynomial. Let the given nth-degree polynomial with
real coefficients be

$$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n.$$

Let us consider the curve that represents in rectangular coordinates
the equation y = f(x), i.e., the graph of this polynomial. This curve is
sometimes called an nth-order parabola. First of all, it is clear that for any
real x there is one and only one definite y = f(x); consequently, the graph
of f ranges arbitrarily far to the right and to the left. In addition, for
continuous change of x, f(x) as well as $f'(x)$ change continuously, i.e., without
jumps. Therefore, the graph of f is a smooth curve. For x large in absolute
value the first term $a_0 x^n$ exceeds in absolute value the sum of all remaining
terms, since they are all of lower degree. From this it follows that if n is
even and $a_0 > 0$, then the graph of f on the right and on the left goes to
infinity upward (and if $a_0 < 0$, downward); but if n is odd and $a_0 > 0$,
then on the right it goes upward and on the left downward (if $a_0 < 0$,
then conversely).

The points of intersection of the graph of f with the Ox-axis, i.e., those
points where y = f(x) = 0, correspond to the real roots of the equation
f(x) = 0; there are no more than n of them. At the maxima and minima
of the graph y = f(x), the derivative $f'(x) = 0$; consequently, the number
of maxima and minima is not greater than n - 1. If on some section
$f''(x) > 0$, the first derivative increases there, i.e., the graph is concave
upward; if $f''(x) < 0$, then the graph is concave downward. Because some



Fig. 13.

Fig. 14.

Here are examples of the graphs of polynomials

$$f(x) = x^3 - 3x + 1 \quad \text{(figure 13)},$$

$$f(x) = x^4 - x^3 - 4x^2 + 4x + 1 \quad \text{(figure 14)}.$$

After constructing the graph of a polynomial it is easy to find
approximations to its roots. Namely, the roots are the abscissas of the points of
intersection of the graph with the Ox-axis.

The method of “undershooting” and “overshooting.” Let us substitute in the
polynomial f(x) some integer, for example 3, and then
substitute 4, 5, .... If in substituting 4, 5, 6 we still obtain the same sign
as for 3, but for 7 the opposite sign, then it is clear that between 6 and 7
the polynomial f(x) has at least one root. Now we substitute 6, 6.1, 6.2, ...
and find two neighbors of this sequence of numbers, for example 6.4 and
6.5, which when substituted give different signs. Accordingly, there will
be at least one root between them. Then we substitute 6.4, 6.41, 6.42,
6.43, ... and find even closer limits for the root, for example, 6.42 and
6.43, etc. This is the method of “undershooting and overshooting.” The
method can be considerably simplified by applying at each step of the
calculations a supplementary transformation of the polynomial, and then
at each step after the first, it will be necessary to substitute only whole
numbers and not fractions, and moreover, only the whole numbers 1, 2, ...,
9. But we will not dwell on this simplification.
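
The procedure just described, without the simplifying transformation, can be sketched in a few lines. The function name, its parameters, and the sample polynomial $x^3 - 3x + 1$ with the starting segment from 1 to 2 are assumptions made for illustration; exact decimal fractions are represented by the standard Fraction type so that the signs are computed without rounding.

```python
from fractions import Fraction


def undershoot_overshoot(f, left, digits):
    """Narrow a root down to a segment of length 10**(-digits).

    Assumes f(left) and f(left + 1) have opposite signs. At every stage the
    current segment is cut into ten parts and the part at whose ends f has
    different signs is kept.
    """
    lo = Fraction(left)
    step = Fraction(1)
    positive_at_lo = f(lo) > 0
    for _ in range(digits):
        step /= 10
        for k in range(1, 11):
            x = lo + k * step
            fx = f(x)
            if fx == 0:
                return x, x  # the substituted value happens to be a root
            if (fx > 0) != positive_at_lo:
                lo = x - step  # the sign changes between x - step and x
                break
    return lo, lo + step


def sample(x):
    return x**3 - 3 * x + 1


lo, hi = undershoot_overshoot(sample, 1, 3)
print(float(lo), float(hi))
```

For this polynomial the root lying between 1 and 2 is enclosed between 1.532 and 1.533 after three stages.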

The method of tangents and the method of chords. The method of
tangents, called Newton’s method, and the method of chords, or of linear
interpolation, called also the method of false position (regula falsi), are
used either separately or together to obtain estimates of error. Suppose
that between a and b (a < b) we have only one root of the polynomial f(x), so that
f(a) and f(b) are of opposite sign, and let us also suppose that the second
derivative $f''(x)$ between a and b is of constant sign. In this case the part
of the graph of f(x) between a and b has one of four forms (figure 15).

In cases I and II in figure 15, the tangent to the graph at the point with
abscissa a intersects the Ox-axis at a point with abscissa $\alpha_1$ lying between
the desired root and a. If we calculate the abscissa $\alpha_1$ and consider now
the tangent to the graph at the point with abscissa $\alpha_1$, we analogously
find a point $\alpha_2$ lying between the point $\alpha_1$ and the desired root, and then
find a corresponding $\alpha_3$ and so on. In this way we will obtain better and
better approximations with defect. As can be seen from the diagram,
these values approach the desired root with great rapidity.

In cases III and IV it is necessary, on the other hand, to start with the
abscissa b, and then obtain points $\beta_1, \beta_2, \beta_3, \ldots$, i.e., better and better
approximations with excess. Which of the four cases actually occurs is
easy to determine by the signs of f(a), f(b), and $f''(x)$ for a < x < b.

Since the equation of the tangent to the curve y = f(x) at its point with
abscissa a is

$$y - f(a) = f'(a)(x - a),$$

the abscissa $\alpha_1$ of the point of its intersection with the Ox-axis is obtained
from the equality

$$0 - f(a) = f'(a)(\alpha_1 - a),$$

that is

$$\alpha_1 = a - \frac{f(a)}{f'(a)}.$$

Then

$$\alpha_2 = \alpha_1 - \frac{f(\alpha_1)}{f'(\alpha_1)}, \quad \alpha_3 = \alpha_2 - \frac{f(\alpha_2)}{f'(\alpha_2)},$$

and so on.
Analogously,

$$\beta_1 = b - \frac{f(b)}{f'(b)}, \quad \beta_2 = \beta_1 - \frac{f(\beta_1)}{f'(\beta_1)}, \quad \beta_3 = \beta_2 - \frac{f(\beta_2)}{f'(\beta_2)},$$

and so on.

This is Newton’s method.*

The method of linear interpolation, or false position, consists of the
following. The equation of a chord, as the equation of a line passing
through two given points, has the form

$$\frac{x - a}{b - a} = \frac{y - f(a)}{f(b) - f(a)},$$

and the abscissa $\gamma_1$ of the point of its intersection with the Ox-axis, as
obtained from the equation

$$\frac{x - a}{b - a} = \frac{0 - f(a)}{f(b) - f(a)},$$

is equal to

$$\gamma_1 = a - \frac{(b - a) f(a)}{f(b) - f(a)} = \frac{a f(b) - b f(a)}{f(b) - f(a)}.$$

Taking this number for the new b in cases I and II and for the new a in
cases III and IV, we find in cases I and II

$$\gamma_2 = \frac{a f(\gamma_1) - \gamma_1 f(a)}{f(\gamma_1) - f(a)}, \quad \gamma_3 = \frac{a f(\gamma_2) - \gamma_2 f(a)}{f(\gamma_2) - f(a)},$$

and so forth.

In cases III and IV, taking $\gamma_1$ for the new a we find

$$\gamma_2 = \frac{\gamma_1 f(b) - b f(\gamma_1)}{f(b) - f(\gamma_1)}, \quad \gamma_3 = \frac{\gamma_2 f(b) - b f(\gamma_2)}{f(b) - f(\gamma_2)},$$

and so forth.

The combination of these two methods is particularly important, since 
(as may be seen from the diagrams) it allows us, if the approximations 
from above and below are known, to estimate the error, which is clearly 


* From these formulas we also obtain a rigorous proof of the two assertions made
from a consideration of the diagrams. Namely, the values $\alpha_n$ (or $\beta_n$) with increasing $n$
change monotonically, for example in case I they increase and are bounded, i.e., by
virtue of the Weierstrass lemma, they approach some limit $\alpha$. Replacing $\alpha_n$ in these
formulas by its limit $\alpha$, we obtain $\alpha = \alpha - [f(\alpha)/f'(\alpha)]$, from which $f(\alpha) = 0$, i.e.,
$\alpha$ is a root of $f$.
not greater than the difference between these approximations, since the
desired root is between them.
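
The error estimate can be illustrated by running both methods together, so that one end of the interval moves by a chord step and the other by a tangent step. The sketch below assumes $f(a) < 0 < f(b)$ and $f'' > 0$ on $[a, b]$; the function name, tolerance, and sample polynomial are illustrative assumptions.

```python
def combined_bracket(f, df, a, b, tol=1e-10, max_iter=50):
    """Shrink [a, b] around a root: chord step at a, tangent (Newton) step at b.

    Assumes f(a) < 0 < f(b) and f'' > 0 on [a, b], so that chord values
    approach the root from below and tangent values from above.
    """
    for _ in range(max_iter):
        if b - a <= tol:
            break
        fa, fb = f(a), f(b)
        a = a - fa * (b - a) / (fb - fa)  # chord through (a, f(a)) and (b, f(b))
        b = b - fb / df(b)                # tangent at the old point b
    return a, b


f = lambda x: x**3 - 2 * x - 5
df = lambda x: 3 * x**2 - 2
lo, hi = combined_bracket(f, df, 2.0, 3.0)
print(lo, hi, hi - lo)  # hi - lo bounds the error of either approximation
```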

Remark. It is important to note that the fact that $f(x)$ is a polynomial,
and not some other function of $x$, does not play any role at all either in
Newton’s method, or in the method of linear interpolation, i.e., both
of these methods and their combination can be adapted, under the
aforementioned conditions, to transcendental equations.

Lobačevskiĭ’s method. One of the most widely used methods of
calculation of roots, especially of complex roots, is the method* proposed
by N. I. Lobačevskiĭ in his book “Algebra,” published in 1834. The basic
idea of this method goes back to Bernoulli.

We note, first of all, that if we are given a polynomial whose roots are
$x_1, x_2, \ldots, x_n$, then it is easy to write down the polynomial, also of the
nth degree, whose roots are $x_1^2, x_2^2, \ldots, x_n^2$, i.e., the squares of the roots
of the given polynomial. Indeed, if $x_1, x_2, \ldots, x_n$ are the roots of the
polynomial

$$x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n,$$

then it may be written as

$$(x - x_1)(x - x_2) \cdots (x - x_n),$$

and the polynomial

$$x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots \pm a_n,$$

whose roots are the roots of the given polynomial taken with opposite
sign, may be written as

$$(x + x_1)(x + x_2) \cdots (x + x_n).$$

The product of these two polynomials is consequently

$$(x^2 - x_1^2)(x^2 - x_2^2) \cdots (x^2 - x_n^2)$$

and therefore contains only even powers of $x$. Setting $x^2 = y$, we obtain
an nth-degree polynomial in $y$

$$y^n + b_1 y^{n-1} + b_2 y^{n-2} + \cdots + b_n,$$

which may be written as

$$(y - x_1^2)(y - x_2^2) \cdots (y - x_n^2),$$


* This method was discovered independently by Dandelin (1826), N. I. Lobačevskiĭ
(1834), and Graeffe (1837).
since its roots are $x_1^2, x_2^2, \ldots, x_n^2$. Instead of directly multiplying the
polynomial

$$x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n$$

by the polynomial

$$x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots \pm a_n,$$

we can obtain the coefficients $b_k$ according to the following scheme. In
the first row, above a horizontal line, we write $1, a_1, a_2, \ldots, a_n$ and then below
the line, under each of these coefficients $a_k$, we write first its square $a_k^2$,
then minus twice the product of its neighbors

$$-2a_{k-1}a_{k+1},$$

then plus twice the product of the coefficients

$$+2a_{k-2}a_{k+2},$$

symmetric with respect to $a_k$, etc., alternating in sign until all further
coefficients on one side or the other are equal to zero. The coefficients $b_k$
are then obtained as the sum of the corresponding columns of numbers
written under the line.

After obtaining these coefficients $1, b_1, b_2, \ldots, b_n$ of the polynomial
whose roots are $x_1^2, x_2^2, \ldots, x_n^2$, we next construct the coefficients $1,
c_1, c_2, \ldots, c_n$ of the polynomial whose roots are the squares of the roots
of the polynomial

$$y^n + b_1 y^{n-1} + b_2 y^{n-2} + \cdots + b_n,$$

i.e., $x_1^4, x_2^4, \ldots, x_n^4$. Then analogously we obtain the coefficients $1, d_1,
d_2, \ldots, d_n$ of the polynomial whose roots are $x_1^8, x_2^8, \ldots, x_n^8$; and then the
polynomial whose roots are $x_1^{16}, x_2^{16}, \ldots, x_n^{16}$, and so forth.

Let us consider only the fundamental idea of Lobačevskiĭ’s method;
moreover, we restrict ourselves for simplicity to the case when all roots
of the equation are real and distinct in absolute value. Let

$$|x_1| > |x_2| > \cdots > |x_n|,$$

i.e., let $x_1$ be the root largest in absolute value, $x_2$ the next largest, and so
on. Let $N$ be a sufficiently large number and let the polynomial

$$X^n + A_1 X^{n-1} + A_2 X^{n-2} + \cdots + A_n$$

have roots equal to the $N$th powers of the roots $x_1, x_2, \ldots, x_n$ of the given
polynomial, i.e.,

$$-A_1 = x_1^N + x_2^N + \cdots + x_n^N,$$

$$A_2 = x_1^N x_2^N + x_1^N x_3^N + \cdots + x_{n-1}^N x_n^N,$$

$$\cdots$$

$$\pm A_n = x_1^N x_2^N \cdots x_n^N.$$

Then in the sequence of numbers $|x_1^N|, |x_2^N|, \ldots, |x_n^N|$ for large $N$ each
successive number is so much smaller than its predecessor that in these
expressions for $A_1, A_2, \ldots, A_n$ we may retain only the first summand, the
sum of all remaining summands being neglected in comparison with the
first. We thus obtain the approximate formulas

$$x_1^N \approx -A_1, \qquad x_1^N x_2^N \approx A_2, \qquad \ldots, \qquad x_1^N x_2^N \cdots x_n^N = \pm A_n,$$

or, dividing pairwise and taking the $N$th roots, we have the following
formulas for $|x_k|$:

$$|x_1| \approx \sqrt[N]{-A_1}, \qquad |x_2| \approx \sqrt[N]{-\frac{A_2}{A_1}}, \qquad \ldots, \qquad |x_n| \approx \sqrt[N]{-\frac{A_n}{A_{n-1}}}.$$

It can be shown that it is sufficient to extend the computation up to the
polynomial whose coefficients, taken with signs $+ - + - \cdots$, will be
equal, with the necessary degree of exactness, to the squares of the
corresponding coefficients of the preceding polynomial.
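
The root-squaring scheme can be sketched as follows. One point of convention should be stated as an assumption of this sketch: if the new polynomial is written as $y^n + b_1 y^{n-1} + \cdots + b_n$, then the column sum under $a_k$ equals $(-1)^k b_k$, so the code attaches the alternating sign explicitly. The function names and the sample polynomial $(x-1)(x-2)(x-3)$ are illustrative, and the estimate of the moduli is meaningful only for real roots with distinct absolute values.

```python
def square_roots_step(a):
    """Given a = [1, a_1, ..., a_n], return [1, b_1, ..., b_n] for the monic
    polynomial whose roots are the squares of the roots of the original."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        col = a[k] ** 2          # first entry of the column: the square
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            col += sign * 2 * a[k - j] * a[k + j]  # symmetric products
            sign, j = -sign, j + 1
        b.append((-1) ** k * col)  # column sums alternate in sign
    return b


def root_moduli(a, steps=5):
    """Estimate |x_1| > ... > |x_n| after `steps` squarings (N = 2**steps)."""
    A = list(a)
    for _ in range(steps):
        A = square_roots_step(A)
    N = 2 ** steps
    return [(-A[k] / A[k - 1]) ** (1.0 / N) for k in range(1, len(A))]


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
coeffs = [1, -6, 11, -6]
print(square_roots_step(coeffs))  # coefficients of (y - 1)(y - 4)(y - 9)
print(root_moduli(coeffs))
```

Integer coefficients keep the squaring steps exact in Python; with floating-point input the coefficients grow quickly and overflow limits the number of steps.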

A detailed exposition of Lobačevskiĭ’s method can be found in the well-known
book of Academician A. N. Krylov “Lectures on approximate
calculations.”
CHAPTER V

ORDINARY DIFFERENTIAL EQUATIONS


§1. Introduction 

Examples of differential equations. The equations that we have
encountered up to now have been for the most part concerned with
finding the numerical value of one magnitude or another. When, for
example, in the search for maxima and minima of functions, we solved
an equation and found those points for which the rate of change of a
function vanishes, or when in Chapter IV we considered the problem of
finding the roots of polynomials, we were in each case looking for isolated
numbers. But in the applications of mathematics there often arise problems
of a qualitatively different sort, in which the unknown is itself a function,
a law expressing the dependence of certain variables on others. For
example, in investigating the process of the cooling of a body, our task
is to determine how its temperature will change in the course of time; to
describe the motion of a planet or a star we must determine the dependence
of their coordinates on time, and so forth.

We can quite often construct an equation for finding the required
unknown functions, such equations being called functional equations.
The nature of these may, generally speaking, be extremely varied; in fact,
it may be said that we have already met the simplest and most primitive
functional equations when we were considering implicit functions.

The problem of finding unknown functions will concern us in Chapters
V, VI, and VII. In the present chapter, and in the following one, we will
consider the most important class of equations serving to determine such
functions, namely differential equations; that is, equations in which not only
the unknown function occurs, but also its derivatives of various orders.
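The following equations may serve as examples:

$$\frac{dx}{dt} + P(t)x = Q(t), \qquad \frac{d^2x}{dt^2} + m^2 x = A\sin\omega t, \qquad \frac{d^2x}{dt^2} = tx,$$

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \qquad (1)$$

In the first three of these, the unknown function is denoted by the letter
$x$ and the independent variable by $t$; in the last three, the unknown function
is denoted by the letter $u$ and it depends on two arguments, $x$ and $t$, or
$x$ and $y$.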

The great importance of differential equations in mathematics, and
especially in its applications, is due chiefly to the fact that the investigation
of many problems in physics and technology may be reduced to the
solution of such equations.

Calculations involved in the construction of electrical machinery or of
radiotechnical devices, computation of the trajectory of projectiles,
investigation of the stability of an aircraft in flight, or of the course of a
chemical reaction, all depend on the solution of differential equations.

It often happens that the physical laws governing a phenomenon are
written in the form of differential equations, so that the differential
equations themselves provide an exact quantitative (numerical) expression
of these laws. The reader will see in the following chapters how the laws of
conservation of mass and of heat energy are written in the form of
differential equations. The laws of mechanics discovered by Newton allow
one to investigate the behavior of any mechanical system by means of
differential equations.

Let us illustrate by a simple example. Consider a material particle of
mass $m$ moving along an axis $Ox$, and let $x$ denote its coordinate at the
instant of time $t$. The coordinate $x$ will vary with the time, and knowledge
of the entire motion of the particle is equivalent to knowledge of the
functional dependence of $x$ on the time $t$. Let us assume that the motion
is caused by some force $F$, the value of which depends on the position of
the particle (as defined by the coordinate $x$), on the velocity of motion
$v = dx/dt$ and on the time $t$, i.e., $F = F(x, dx/dt, t)$. According to the
laws of mechanics, the action of the force $F$ on the particle necessarily
produces an acceleration $w = d^2x/dt^2$ such that the product of $w$ and the
mass $m$ of the particle is equal to the force, and so at every instant of the
motion we have the equation

$$m\frac{d^2x}{dt^2} = F\left(x, \frac{dx}{dt}, t\right). \qquad (2)$$
This is the differential equation that must be satisfied by the function
$x(t)$ describing the behavior of the moving particle. It is simply a
representation of laws of mechanics. Its significance lies in the fact that it enables
us to reduce the mechanical problem of determining the motion of a
particle to the mathematical problem of the solution of a differential
equation.

Later in this chapter, the reader will find other examples showing how
the study of various physical processes can be reduced to the investigation
of differential equations.

The theory of differential equations began to develop at the end of the
17th century, almost simultaneously with the appearance of the differential
and integral calculus. At the present time, differential equations have
become a powerful tool in the investigation of natural phenomena. In
mechanics, astronomy, physics, and technology they have been the means
of immense progress. From his study of the differential equations of the
motion of heavenly bodies, Newton deduced the laws of planetary motion
discovered empirically by Kepler. In 1846 Leverrier predicted the existence
of the planet Neptune and determined its position in the sky on the basis
of a numerical analysis of the same equations.

To describe in general terms the problems in the theory of differential
equations, we first remark that every differential equation has in general
not one but infinitely many solutions; that is, there exists an infinite set of
functions that satisfy it. For example, the equation of motion for a particle
must be satisfied by any motion induced by the given force $F(x, dx/dt, t)$,
independently of the starting point or the initial velocity. To each separate
motion of the particle there will correspond a particular dependence of
$x$ on time $t$. Since under a given force $F$ there may be infinitely many
motions, the differential equation (2) will have an infinite set of
solutions.

Every differential equation defines, in general, a whole class of functions
that satisfy it. The basic problem of the theory is to investigate the functions
that satisfy the differential equation. The theory of these equations must
enable us to form a sufficiently broad notion of the properties of all
functions satisfying the equation, a requirement which is particularly
important in applying these equations to the natural sciences. Moreover,
our theory must guarantee the means of finding numerical values of the
functions, if these are needed in the course of a computation. We will
speak later about how these numerical values may be found.

If the unknown function depends on a single argument, the differential
equation is called an ordinary differential equation. If the unknown function
depends on several arguments and the equation contains derivatives with
respect to some or all of these arguments, the differential equation is
called a partial differential equation. The first three of the equations in (1)
are ordinary and the last three are partial.

The theory of partial differential equations has many peculiar features
which make them essentially different from ordinary differential equations.
The basic ideas involved in such equations will be presented in the next
chapter; here we will examine only ordinary differential equations.

Let us consider some examples. 


Example 1. The law of decay of radium says that the rate of decay is
proportional to the amount of radium present at the given moment. Suppose we know
that at a certain time $t = t_0$ we had $R_0$ grams of radium. We want to know
the amount of radium present at any subsequent time $t$.

Let $R(t)$ be the amount of undecayed radium at time $t$. The rate of decay
is given by the value of $-(dR/dt)$. Since this is proportional to $R$, we have

$$-\frac{dR}{dt} = kR, \qquad (3)$$

where $k$ is a constant.

In order to solve our problem, it is necessary to determine a function
from the differential equation (3). For this purpose we note that the
function inverse to $R(t)$ satisfies the equation

$$\frac{dt}{dR} = -\frac{1}{kR}, \qquad (4)$$

since $dt/dR = 1/(dR/dt)$. From the integral calculus it is known that
equation (4) is satisfied by any function of the form

$$t = -\frac{1}{k}\ln R + C,$$

where $C$ is an arbitrary constant. From this relation we determine $R$ as a
function of $t$. We have

$$R = e^{-k(t - C)} = C_1 e^{-kt}. \qquad (5)$$

From the whole set of solutions (5) of equation (3) we must select one
which for $t = t_0$ has the value $R_0$. This solution is obtained by setting
$C_1 = R_0 e^{kt_0}$.
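
As a small numerical illustration of formula (5) with the constant $C_1 = R_0 e^{kt_0}$, the sketch below evaluates $R(t)$ and checks equation (3) by a finite difference. The function name and all numerical values are assumptions chosen only for the example; $k$ is taken so that the amount halves after 1600 time units.

```python
import math


def radium_amount(t, t0, R0, k):
    """Solution (5) of -dR/dt = k R with R(t0) = R0, i.e. C1 = R0 * exp(k * t0)."""
    C1 = R0 * math.exp(k * t0)
    return C1 * math.exp(-k * t)


# Illustrative values (assumptions): R0 = 10 grams at t0 = 5.
k = math.log(2) / 1600.0
t0, R0 = 5.0, 10.0

print(radium_amount(t0, t0, R0, k))           # amount at the initial instant
print(radium_amount(t0 + 1600.0, t0, R0, k))  # amount after one halving time

# Finite-difference check of equation (3) at some time t.
t, h = 300.0, 1e-3
rate = -(radium_amount(t + h, t0, R0, k) - radium_amount(t - h, t0, R0, k)) / (2 * h)
print(rate, k * radium_amount(t, t0, R0, k))
```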

From the mathematical point of view, equation (3) is the statement
of a very simple law for the change with time of the function $R$; it says that
the rate of decrease $-(dR/dt)$ of the function is proportional to the value
of the function $R$ itself. Such a law for the rate of change of a function is
satisfied not only by the phenomena of radioactive decay but also by
many other physical phenomena.

We find exactly the same law for the rate of change of a function, for
example, in the study of the cooling of a body, where the rate of decrease
in the amount of heat in the body is proportional to the difference between
the temperature of the body and the temperature of the surrounding
medium, and the same law occurs in many other physical processes. Thus
the range of application of equation (3) is vastly wider than the particular
problem of the radioactive decay from which we obtained the equation.

Example 2. Let a material point of mass $m$ be moving along the
horizontal axis $Ox$ in a resisting medium, for example in a liquid or a
gas, under the influence of the elastic force of two springs, acting under
Hooke’s law (figure 1), which states that the elastic force acts toward the
position of equilibrium and is proportional to the deviation from the
equilibrium position. Let the equilibrium position occur at the point
$x = 0$. Then the elastic force is equal to $-bx$ ($b > 0$).

We will assume that the resistance of the medium is proportional to the
velocity of motion, i.e., equal to $-a(dx/dt)$, where $a > 0$ and the minus
sign indicates that the resisting medium acts against the motion. Such an
assumption about the resistance of the medium is confirmed by experiment.

From Newton’s basic law that the product of the mass of a material
point and its acceleration is equal to the sum of the forces acting on it,
we have

$$m\frac{d^2x}{dt^2} = -bx - a\frac{dx}{dt}. \qquad (6)$$

Thus the function $x(t)$, which describes the position of the moving point
at any instant of time $t$, satisfies the differential equation (6). We will
investigate the solutions of this equation in one of the later sections.

If, in addition to the forces mentioned, the material point is acted upon
by still another force $F$, outside of the system, then the equation of motion
(6) takes the form

$$m\frac{d^2x}{dt^2} = -bx - a\frac{dx}{dt} + F.$$
Example 3. A mathematical pendulum is a material point of mass $m$,
suspended on a string whose length will be denoted by $l$. We will assume
that at all stages the pendulum stays in one plane, the plane of the
drawing (figure 2). The force tending to restore the pendulum to the
vertical position $OA$ is the force of gravity $mg$, acting on the
material point. The position of the pendulum at any time $t$ is
given by the angle $\phi$ by which it differs from the vertical $OA$. We
take the positive direction of $\phi$ to be counterclockwise. The arc
$AA' = l\phi$ is the distance moved by the material point from the
position of equilibrium $A$. The velocity of motion $v$ will be directed along
the tangent to the circle and will have the following numerical
value:

$$v = l\frac{d\phi}{dt}.$$

To establish the equation of motion, we decompose the force of gravity
$mg$ into two components $Q$ and $P$, the first of which is directed along the
radius $OA'$ and the second along the tangent to the circle. The component
$Q$ cannot affect the numerical value of the rate $v$, since clearly it is balanced
by the resistance of the suspension $OA'$. Only the component $P$ can affect
the value of the velocity $v$. This component always acts toward the
equilibrium position $A$, i.e., toward a decrease in $\phi$, if the angle $\phi$ is positive,
and toward an increase in $\phi$, if $\phi$ is negative. The numerical value of $P$ is
equal to $-mg\sin\phi$, so that the equation of motion of the pendulum is

$$m\frac{dv}{dt} = -mg\sin\phi,$$

or

$$\frac{d^2\phi}{dt^2} = -\frac{g}{l}\sin\phi. \qquad (7)$$


It is interesting to note that the solutions of this equation cannot be
expressed by a finite combination of elementary functions. The set of
elementary functions is too small to give an exact description of even
such a simple physical process as the oscillation of a mathematical
pendulum. Later we will see that the differential equations that are solvable
by elementary functions are not very numerous, so that it very frequently
happens that investigation of a differential equation encountered in physics
or mechanics leads us to introduce new classes of functions, to subject
them to investigation, and thus to widen our arsenal of functions that
may be used for the solution of applied problems.

Let us now restrict ourselves to small oscillations of the pendulum for
which, with small error, we may assume that the arc $AA'$ is equal to its
projection $x$ on the horizontal axis $Ox$ and $\sin\phi$ is equal to $\phi$. Then
$\phi \approx \sin\phi = x/l$ and the equation of motion of the pendulum will take
on the simpler form

$$\frac{d^2x}{dt^2} = -\frac{g}{l}x. \qquad (8)$$

Later we will see that this equation is solvable by trigonometric functions
and that by using them we may describe with sufficient exactness the “small
oscillations” of a pendulum.
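
How small the oscillations must be can be explored numerically. The sketch below integrates equation (7) for the angle by the classical fourth-order Runge-Kutta method (a technique not discussed in this passage, used here only as a convenient tool) and compares the result with the cosine solution of the linearized equation, released from rest. The function names, $g = 9.81$, $l = 1$, and the amplitudes are illustrative assumptions.

```python
import math


def pendulum_angle(phi0, t_end, g=9.81, l=1.0, n=2000):
    """Integrate phi'' = -(g/l) sin(phi), phi(0) = phi0, phi'(0) = 0, by RK4."""
    h = t_end / n
    phi, w = phi0, 0.0  # angle and angular velocity

    def acc(p):
        return -(g / l) * math.sin(p)

    for _ in range(n):
        k1p, k1w = w, acc(phi)
        k2p, k2w = w + 0.5 * h * k1w, acc(phi + 0.5 * h * k1p)
        k3p, k3w = w + 0.5 * h * k2w, acc(phi + 0.5 * h * k2p)
        k4p, k4w = w + h * k3w, acc(phi + h * k3p)
        phi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        w += h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return phi


def small_angle(phi0, t, g=9.81, l=1.0):
    """Solution of the linearized equation with the same initial data."""
    return phi0 * math.cos(math.sqrt(g / l) * t)


for amplitude in (0.05, 1.5):  # radians
    exact = pendulum_angle(amplitude, 2.0)
    approx = small_angle(amplitude, 2.0)
    print(amplitude, exact, approx, abs(exact - approx))
```

For the small amplitude the two curves are nearly indistinguishable over this time span, while for the large one they drift apart, because the true period grows with amplitude.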


Example 4. Helmholtz’ acoustic resonator (figure 3) consists of an
air-filled vessel $V$, the volume of which is equal to $v$, with a cylindrical
neck $F$. Approximately, we may consider the air in the neck of the
container as a cork of mass

$$m = \rho s l, \qquad (9)$$

where $\rho$ is the density of the air, $s$ is the area of the cross section of the
neck, and $l$ is its length.

If we assume that this mass of air is displaced from a position of
equilibrium by an amount $x$, then the pressure of the air in the container
with volume $v$ is changed from the initial value $p$ by some amount which
we will call $\Delta p$.

We will assume that the pressure $p$ and the volume $v$ satisfy the
adiabatic law $pv^k = C$. Then, neglecting magnitudes of higher order, we
have

$$\Delta p \cdot v^k + pkv^{k-1} \cdot \Delta v = 0$$

and

$$\Delta p = -kp\frac{\Delta v}{v} = -\frac{kps}{v}x. \qquad (10)$$

(In our case, $\Delta v = sx$.) The equation of motion of the mass of air in the
neck may be written as:

$$m\frac{d^2x}{dt^2} = \Delta p \cdot s. \qquad (11)$$

Here $\Delta p \cdot s$ is the force exerted by the gas within the container on the
column of air in the neck. From (9), (10), and (11) we get

$$\rho l\frac{d^2x}{dt^2} = -\frac{kps}{v}x, \qquad (12)$$

where $p$, $\rho$, $v$, $l$, $k$, and $s$ are constants.


Example 5. An equation of the form (6) also arises in the study of
electric oscillations in a simple oscillator circuit. The circuit diagram
is given in figure 4. Here on the left we have a condenser of capacity $C$,
in series with a coil of inductance $L$, and a resistance $R$. At some instant
let the condenser have a voltage across its terminals. In the absence
of inductance from the circuit, the current would flow until such time
as the terminals of the condenser were at the same potential. The presence
of an inductance alters the situation, since the circuit will now generate
electric oscillations. To find a law for these oscillations, we denote by
$v(t)$, or simply by $v$, the voltage across the condenser at the instant $t$,
by $I(t)$ the current at the instant $t$, and by $R$ the resistance. From
well-known laws of physics, $I(t)R$ remains constantly equal to the total
electromotive force, which is the sum of the voltage across the condenser
and the inductance $-L(dI/dt)$. Thus,

$$IR = -v - L\frac{dI}{dt}. \qquad (13)$$

We denote by $Q(t)$ the charge on the condenser at time $t$. Then the
current in the circuit will, at each instant, be equal to $dQ/dt$. The potential
difference $v(t)$ across the condenser is equal to $Q(t)/C$. Thus $I = dQ/dt =
C(dv/dt)$ and equation (13) may be transformed into

$$LC\frac{d^2v}{dt^2} + RC\frac{dv}{dt} + v = 0. \qquad (14)$$

Fig. 4.
Example 6. The circuit diagram of an electron-tube generator of
electromagnetic oscillations is shown in figure 5. The oscillator circuit
consisting of a capacitance $C$, across a resistance $R$ and an inductance $L$,
represents the basic oscillator system. The coil $L'$ and the tube shown in
the center of figure 5 form a so-called “feedback.” They connect a source
of energy, namely the battery $B$, with the L-R-C circuit; $K$ is the cathode
of the tube, $A$ the plate, and $S$ the grid. In such an L-R-C circuit
“self-oscillations” will arise. For any actual system in an oscillatory state the
energy is transformed into heat or is dissipated in some other form to the
surrounding bodies, so that to maintain a stationary state of oscillation it
is necessary to have an outside source of energy. Self-oscillations differ
from other oscillatory processes in that to maintain a stationary oscillatory
state of the system the outside source does not have to be periodic.
A self-oscillatory system is constructed in such a way that a constant
source of energy, in our case the battery $B$, will maintain a stationary
oscillatory state. Examples of self-oscillatory systems are a clock, an
electric bell, a string and bow moved by the hand of the musician, the
human voice, and so forth.




Fig. 6. 


The current $I(t)$ in the oscillatory L-R-C circuit satisfies the equation

$$L\frac{dI}{dt} + RI + v = M\frac{dI_a}{dt}. \qquad (15)$$

Here $v = v(t)$ is the voltage across the condenser at the instant $t$, $I_a(t)$
is the plate current through the coil $L'$; $M$ is the coupling coefficient
between the coils $L$ and $L'$. In comparison with equation (13), equation
(15) contains the extra term $M(dI_a/dt)$.

We will assume that the plate current $I_a(t)$ depends only on the voltage
between the grid $S$ and the cathode of the tube (i.e., we will neglect the
reactance of the anode), so that this voltage is equal to the voltage $v(t)$
across the condenser $C$. The character of the functional dependence of
$I_a$ on $v$ is given in figure 6. The curve as sketched is usually taken to be a
cubical parabola and we write an approximate equation for it as:

$$I_a = a_1 v + a_2 v^2 + a_3 v^3.$$

Substituting this into the right side of equation (15), and using the fact
that

$$I = C\frac{dv}{dt},$$

we get for $v$ the equation

$$LC\frac{d^2v}{dt^2} + \left[RC - M(a_1 + 2a_2 v + 3a_3 v^2)\right]\frac{dv}{dt} + v = 0. \qquad (16)$$

In the examples considered, the search for certain physical quantities
characteristic of a given physical process is reduced to the search for
solutions of ordinary differential equations.

Problems in the theory of differential equations. We now give exact
definitions. An ordinary differential equation of order $n$ in one unknown
function $y$ is a relation of the form

$$F[x, y(x), y'(x), y''(x), \ldots, y^{(n)}(x)] = 0 \qquad (17)$$

between the independent variable $x$ and the quantities

$$y(x), \quad y'(x) = \frac{dy}{dx}, \quad y''(x) = \frac{d^2y}{dx^2}, \quad \ldots, \quad y^{(n)}(x) = \frac{d^ny}{dx^n}.$$

The order of a differential equation is the order of the highest derivative
of the unknown function appearing in the differential equation. Thus the
equation in example 1 is of the first order, and those in examples 2, 3, 4, 5,
and 6 are of the second order.

A function $\phi(x)$ is called a solution of the differential equation (17) if
substitution of $\phi(x)$ for $y$, $\phi'(x)$ for $y'$, $\ldots$, $\phi^{(n)}(x)$ for $y^{(n)}$ produces an
identity.

Problems in physics and technology often lead to a system of ordinary
differential equations with several unknown functions, all depending on
the same argument and on their derivatives with respect to that argument.

For greater concreteness, the explanations that follow will deal chiefly
with one ordinary differential equation of order not higher than the second
and with one unknown function. With this example one may explain the
essential properties of all ordinary differential equations and of systems
of such equations in which the number of unknown functions is equal to
the number of equations.

We have spoken earlier of the fact that, as a rule, every differential
equation has not one but an infinite set of solutions. Let us illustrate this
first of all by intuitive considerations based on the examples given in
equations (2-6). In each of these, the corresponding differential equation
is already fully defined by the physical arrangement of the system. But in
each of these systems there can be many different motions. For example,
it is perfectly clear that the pendulum described by equation (8) may
oscillate with many different amplitudes. To each of these different
oscillations of the pendulum there corresponds a different solution of equation
(8), so that infinitely many such solutions must exist. It may be shown that
equation (8) is satisfied by any function of the form

$$x = C_1\cos\sqrt{\frac{g}{l}}\,t + C_2\sin\sqrt{\frac{g}{l}}\,t, \qquad (18)$$

where $C_1$ and $C_2$ are arbitrary constants.

It is also physically clear that the motion of the pendulum will be
completely determined only in case we are given, at some instant $t_0$,
the (initial) value $x_0$ of $x$ (the initial displacement of the material point
from the equilibrium position) and the initial rate of motion
$x'_0 = (dx/dt)|_{t=t_0}$. These initial conditions determine the constants $C_1$ and
$C_2$ in formula (18).
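
The way the two initial conditions fix $C_1$ and $C_2$ can be made explicit: substituting $t = t_0$ into (18) and into its derivative gives two linear equations for the two constants. The following sketch solves them in closed form; the function names and the numerical values are illustrative assumptions.

```python
import math


def pendulum_constants(x0, v0, t0, g=9.81, l=1.0):
    """Constants C1, C2 of formula (18) from x(t0) = x0 and x'(t0) = v0."""
    w = math.sqrt(g / l)
    c, s = math.cos(w * t0), math.sin(w * t0)
    # Solve  x0 = C1*c + C2*s  and  v0 = w*(-C1*s + C2*c)  for C1, C2.
    C1 = x0 * c - (v0 / w) * s
    C2 = x0 * s + (v0 / w) * c
    return C1, C2


def x_of_t(t, C1, C2, g=9.81, l=1.0):
    w = math.sqrt(g / l)
    return C1 * math.cos(w * t) + C2 * math.sin(w * t)


def v_of_t(t, C1, C2, g=9.81, l=1.0):
    w = math.sqrt(g / l)
    return w * (-C1 * math.sin(w * t) + C2 * math.cos(w * t))


# Illustrative initial state (assumed values).
C1, C2 = pendulum_constants(x0=0.02, v0=-0.1, t0=0.7)
print(C1, C2, x_of_t(0.7, C1, C2), v_of_t(0.7, C1, C2))
```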

In exactly the same way, the differential equations we have found in
other examples will have infinitely many solutions.

In general, it can be proved, under very broad assumptions concerning
the given differential equation (17) of order $n$ in one unknown function,
that it has infinitely many solutions. More precisely: if for some “initial
value” of the argument, we assign an “initial value” to the unknown
function and to all of its derivatives through order $n - 1$, then one can
find a solution of equation (17) which takes on these preassigned initial
values. It may also be shown that such initial conditions completely
determine the solution, so that there exists only one solution satisfying
the initial conditions given earlier. We will discuss this question later in
more detail. For our present aims, it is essential to note that the initial
values of the function and the first $n - 1$ derivatives may be given
arbitrarily. We have the right to make any choice of $n$ values which define
an “initial state” for the desired solution.

If we wish to construct a formula that will if possible include all solutions
of a differential equation of order $n$, then such a formula must contain $n$
independent arbitrary constants, which will allow us to impose $n$ initial
conditions. Such solutions of a differential equation of order $n$, containing
$n$ independent arbitrary constants, are usually called general solutions
of the equation. For example, a general solution of (8) is given by formula
(18) containing two arbitrary constants; a general solution of equation (3)
is given by formula (5).

We will now try to formulate in very general outline the problems
confronting the theory of differential equations. These are many and
varied, and we will indicate only the most important ones.

If the differential equation is given together with its initial conditions,
then its solution is completely determined. The construction of formulas
giving the solution in explicit form is one of the first problems of the theory.
Such formulas may be constructed only in simple cases, but if they are
found, they are of great help in the computation and investigation of the
solution.

The theory should provide a way to obtain some notion of the behavior
of a solution: whether it is monotonic or oscillatory, whether it is periodic
or approaches a periodic function, and so forth.

Suppose we change the initial values for the unknown function and its
derivatives; that is, we change the initial state of the physical system. Then
we will also change the solution, since the whole physical process will
now run differently. The theory should provide the possibility of judging
what this change will be. In particular, for small changes in the initial
values will the solution also change by a small amount and will it therefore
be stable in this respect, or may it be that small changes in the initial
conditions will give rise to large changes in the solution so that the latter
will be unstable?

We must also be able to set up a qualitative, and where possible,
quantitative picture of the behavior not only of the separate solutions of
an equation, but also of all of the solutions taken together.

In machine construction there often arises the question of making a
choice of parameters characterizing an apparatus or machine that will
guarantee satisfactory operation. The parameters of an apparatus appear
in the form of certain magnitudes in the corresponding differential
equation. The theory must help us make clear what will happen to
the solutions of the equation (to the working of the apparatus) if
we change the differential equation (change the parameters of the
apparatus).

Finally, when it is necessary to carry out a computation, we will need
to find the solution of an equation numerically, and here the theory will
be obliged to provide the engineer and the physicist with the most rapid
and economical methods for calculating the solutions.
§2. Linear Differential Equations with Constant Coefficients

For certain important classes of ordinary differential equations the
general solution may be expressed in terms of simple well-known functions.
One of these classes consists of those differential equations with constant
coefficients that are linear with respect to the unknown function and its
derivatives (in short, linear). The differential equations (3), (6), (8), and
(14) are examples of such equations. A linear equation is called
homogeneous if it has no term which does not contain the unknown function,
and nonhomogeneous if there is such a term.

Homogeneous linear équations of the second order with constant 
coefficients. Such équations hâve the form 

«Sr< 6 > 

where m, a, and b are constants. We will assume that m is positive; this 
does not restrict the generality, since we can always ensure this situation 
if need be by changing the sign of ail coefficients, provided that m ^ 0, 
which we will assume. 

We will look for a solution of this equation in the form of an exponential
function $e^{\lambda t}$ and ask how the constant $\lambda$ should be chosen so that
the function $x = e^{\lambda t}$ satisfies the equation. Putting $x = e^{\lambda t}$,
$dx/dt = \lambda e^{\lambda t}$, and $d^2x/dt^2 = \lambda^2 e^{\lambda t}$ in the left side of
equation (6), we get

$$ e^{\lambda t}(m\lambda^2 + a\lambda + b). $$

Thus, in order that $x(t) = e^{\lambda t}$ be a solution of equation (6) it is necessary
and sufficient that

$$ m\lambda^2 + a\lambda + b = 0. \tag{19} $$

If $\lambda_1$ and $\lambda_2$ are two real roots of equation (19), then it is easy to prove that
a solution of equation (6) is given by every function of the form

$$ x = C_1 e^{\lambda_1 t} + C_2 e^{\lambda_2 t}, \tag{20} $$

where $C_1$ and $C_2$ are arbitrary constants.

Below we will show that formula (20) gives all solutions of equation
(6) in the case that equation (19) has distinct real roots.

We note the following important properties of the solutions of equation (6):

1. The sum of two solutions is also a solution.

2. A solution multiplied by a constant is also a solution.

In case $\lambda_1$ is a multiple root of equation (19), i.e., $m\lambda_1^2 + a\lambda_1 + b = 0$
and $2m\lambda_1 + a = 0$,* then a solution of equation (6) will also be given by
the function $te^{\lambda_1 t}$, since if we substitute this function and its derivatives
into the left side of equation (6) we get

$$ te^{\lambda_1 t}(m\lambda_1^2 + a\lambda_1 + b) + e^{\lambda_1 t}(2m\lambda_1 + a), $$

which is seen from the previous equations to be identically zero.

The general solution of equation (6) in this case has the form

$$ x = C_1 e^{\lambda_1 t} + C_2 t e^{\lambda_1 t}. \tag{21} $$

Now let equation (19) have complex roots. These roots will be complex
conjugates of each other since m, a, and b are real numbers. Let
$\lambda = \alpha \pm i\beta$. The equation

$$ m(\alpha + i\beta)^2 + a(\alpha + i\beta) + b = 0 $$

is equivalent to the two equations

$$ m\alpha^2 - m\beta^2 + a\alpha + b = 0 \quad\text{and}\quad 2m\alpha\beta + a\beta = 0. \tag{22} $$

It is easy to show that in this case the functions $x = e^{\alpha t}\cos\beta t$ and
$x = e^{\alpha t}\sin\beta t$ are solutions of equation (6). Thus, for example, putting
the function $x(t) = e^{\alpha t}\cos\beta t$ and its derivatives in the left side of
equation (6), we get

$$ e^{\alpha t}\cos\beta t\,(m\alpha^2 - m\beta^2 + a\alpha + b) - e^{\alpha t}\sin\beta t\,(2m\alpha\beta + a\beta). $$

By equation (22) this expression is identically equal to zero.

The general solution of equation (6), if equation (19) has complex roots,
has the form

$$ x = C_1 e^{\alpha t}\sin\beta t + C_2 e^{\alpha t}\cos\beta t, \tag{23} $$

where $C_1$ and $C_2$ are arbitrary constants.

In this way, if we know the roots of equation (19), called the characteristic
equation, we can write down the general solution of equation (6).
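
The three cases of formulas (20), (21), and (23) can be collected in a short numerical sketch. The function names `basis_solutions` and `residual` are illustrative choices, not part of the text; the sketch assumes m > 0 and checks each basis function by inserting it into the left side of equation (6) with central differences.

```python
import math

def basis_solutions(m, a, b):
    """Return two functions of t whose linear combinations give the
    general solution of m*x'' + a*x' + b*x = 0 (assumes m > 0)."""
    disc = a * a - 4.0 * m * b
    if disc > 0:            # distinct real roots, formula (20)
        r = math.sqrt(disc)
        l1, l2 = (-a + r) / (2 * m), (-a - r) / (2 * m)
        return (lambda t: math.exp(l1 * t)), (lambda t: math.exp(l2 * t))
    if disc == 0:           # double root, formula (21)
        l1 = -a / (2 * m)
        return (lambda t: math.exp(l1 * t)), (lambda t: t * math.exp(l1 * t))
    alpha = -a / (2 * m)    # complex roots alpha +/- i*beta, formula (23)
    beta = math.sqrt(-disc) / (2 * m)
    return (lambda t: math.exp(alpha * t) * math.sin(beta * t),
            lambda t: math.exp(alpha * t) * math.cos(beta * t))

def residual(x, m, a, b, t, h=1e-4):
    """Left side of equation (6) at time t, with derivatives
    approximated by central differences of step h."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return m * d2 + a * d1 + b * x(t)
```

For coefficients such as (1, 3, 2), (1, 2, 1), and (1, 2, 5), one for each case, the residual of every basis function should be zero up to the error of the difference approximation.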

We note that the general solution of a linear homogeneous equation of
order n with constant coefficients

$$ a_n \frac{d^n x}{dt^n} + a_{n-1}\frac{d^{n-1} x}{dt^{n-1}} + \cdots + a_1 \frac{dx}{dt} + a_0 x = 0 $$

may be written in a similar manner as a polynomial in exponential and
trigonometric functions, provided we know the roots of the algebraic
equation

$$ a_n\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_0 = 0, $$

which again is called the characteristic equation. Thus, the problem of
integrating a linear ordinary differential equation with constant coefficients
is reduced to an algebraic problem.

* The sum of the roots $\lambda_1$ and $\lambda_2$ of the quadratic equation (19) is
$\lambda_1 + \lambda_2 = -a/m$, and if the roots are the same, that is $\lambda_1 = \lambda_2$, then the
second of the previous equations is true.

We now show that formulas (20), (21), and (23) give all the solutions
of equation (6). We note that $C_1$ and $C_2$ in these formulas may always be
so chosen that the function x(t) satisfies arbitrary initial conditions
$x(t_0) = x_0$, $x'(t_0) = x'_0$. For this $C_1$ and $C_2$ need only be determined
from the system of equations

$$ x_0 = C_1 e^{\lambda_1 t_0} + C_2 e^{\lambda_2 t_0}, $$
$$ x'_0 = \lambda_1 C_1 e^{\lambda_1 t_0} + \lambda_2 C_2 e^{\lambda_2 t_0}, $$

in the case of formula (20), or by two similar equations in the case of
formulas (21) and (23). Clearly, if there existed a solution of equation
(6) not contained among the solutions we have constructed, then there
would exist two distinct solutions of equation (6) satisfying the same
initial conditions. Their difference $x_1(t)$ would not be identically zero and
would satisfy the zero initial conditions $x_1(t_0) = 0$, $x_1'(t_0) = 0$. We will
show that a solution of equation (6) which satisfies the zero initial conditions
can only be $x_1(t) \equiv 0$. Let us first show this under the assumption
that m > 0, a > 0, and b > 0. We multiply the two sides of the equation

$$ m\frac{d^2 x_1}{dt^2} + a\frac{dx_1}{dt} + b x_1 = 0 \tag{24} $$

by $2(dx_1/dt)$. Since

$$ 2\frac{dx_1}{dt}\frac{d^2 x_1}{dt^2} = \frac{d}{dt}\left(\frac{dx_1}{dt}\right)^2 \quad\text{and}\quad 2x_1(t)\frac{dx_1}{dt} = \frac{d}{dt}x_1^2(t), $$

equation (24) may be put in the form

$$ m\frac{d}{dt}\left(\frac{dx_1}{dt}\right)^2 + 2a\left(\frac{dx_1}{dt}\right)^2 + b\frac{d}{dt}x_1^2(t) = 0. $$

Integrating this identity between $t_0$ and t, we get

$$ m\left(\frac{dx_1}{dt}\right)^2 + 2a\int_{t_0}^{t}\left(\frac{dx_1}{dt}\right)^2 dt + b x_1^2(t) = 0. $$

This equation is possible only if $x_1(t) \equiv 0$. Otherwise, for $t > t_0$, we would
clearly have a positive quantity on the left and zero on the right, with a
similar situation for $t < t_0$.

In order to establish our proposition for all constant coefficients m, a,
and b, we consider the function $y_1(t) = x_1(t)e^{-\alpha t}$ which, as it is easy to
show, also satisfies the zero initial conditions. If the value of $\alpha > 0$ is
chosen sufficiently large, then the function $y_1(t)$ will satisfy some equation
of the form (6) with all three coefficients positive. This equation is easily
derived by substituting the function $x_1(t) = y_1(t)e^{\alpha t}$ and its derivatives
into equation (6); it reads
$m y_1'' + (2m\alpha + a) y_1' + (m\alpha^2 + a\alpha + b) y_1 = 0$.
Then, as was shown earlier, we have $y_1(t) \equiv 0$, which
means that $x_1(t) = y_1(t)e^{\alpha t} \equiv 0$.

Thus we have shown that formulas (20), (21), and (23) give all the
solutions of equation (6).
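
As an illustration of how the constants are fixed by the initial data in the case of formula (20), the following sketch solves the two linear equations for $C_1$ and $C_2$ by Cramer's rule. The function name and argument names are hypothetical, and the roots are assumed real and distinct.

```python
import math

def constants_from_initial_data(l1, l2, t0, x0, v0):
    """Solve  x0 = C1*e^(l1*t0) + C2*e^(l2*t0),
              v0 = l1*C1*e^(l1*t0) + l2*C2*e^(l2*t0)
    for C1 and C2 by Cramer's rule (requires l1 != l2)."""
    e1, e2 = math.exp(l1 * t0), math.exp(l2 * t0)
    det = e1 * e2 * (l2 - l1)          # determinant of the 2x2 system
    c1 = (x0 * l2 * e2 - v0 * e2) / det
    c2 = (v0 * e1 - x0 * l1 * e1) / det
    return c1, c2

# Roots -1 and -2 (for example m = 1, a = 3, b = 2), x(0) = 1, x'(0) = 0.
c1, c2 = constants_from_initial_data(-1.0, -2.0, 0.0, 1.0, 0.0)
```

The determinant is nonzero exactly when the roots differ, which is why the system always has a unique solution in this case.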

Let us see what information these formulas give about the character
of the solutions of equation (6). To this end we note the formula

$$ \lambda_{1,2} = -\frac{a}{2m} \pm \sqrt{\frac{a^2}{4m^2} - \frac{b}{m}} \tag{25} $$

for the roots of equation (19). In accordance with the physical applications
which led us to equation (6), we will assume m > 0, $a \geq 0$, and b > 0.

Case 1. $a^2 > 4bm$. The two roots of the characteristic equation (19)
are real, negative, and distinct. In this case the function x(t) given
by formula (20) is a general solution of equation (6). All the functions
given by this formula together with their first derivatives tend to zero for
$t \to +\infty$, and there is no more than one value of t for which they vanish.
It follows that the function x(t) has no more than one maximum or
minimum. Physically, this means that the resistance of the medium is
sufficiently large to prevent oscillations. The moving point cannot pass
through the equilibrium position x = 0 more than once. From then on,
after attaining a maximum distance from the point x = 0, it will begin a
slow approach to the point but will never pass through it again.

Case 2. $a^2 = 4bm$. The two roots of equation (19) are equal to each
other and the general solution of equation (6) is given by formula (21).
In this case again all solutions x(t) and their first derivatives tend to zero
for $t \to +\infty$. Here x(t) and x'(t) cannot vanish more than once. The
character of the motion of the material point with abscissa x(t) is the same
as in the first case.

Case 3. $a^2 < 4bm$. The roots of the characteristic equation (19) have
nonzero imaginary part. The general solution of equation (6) is given by
formula (23). The point x performs oscillations along the x-axis with a
constant period $2\pi/\beta$, which is the same for all solutions of (6), and with
amplitude $Ce^{\alpha t}$, where $\alpha = -(a/2m)$.
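
The three cases can be told apart from the sign of $a^2 - 4bm$ alone. The following sketch, with the illustrative name `classify_motion`, returns the case and, in Case 3, the period $2\pi/\beta$; it assumes m > 0, $a \geq 0$, b > 0 as in the text.

```python
import math

def classify_motion(m, a, b):
    """Classify the solutions of m*x'' + a*x' + b*x = 0
    (assumes m > 0, a >= 0, b > 0). Returns (description, period),
    where period is None when there are no oscillations."""
    disc = a * a - 4.0 * b * m
    if disc > 0:
        return "case 1: no oscillation (distinct real roots)", None
    if disc == 0:
        return "case 2: no oscillation (double root)", None
    beta = math.sqrt(-disc) / (2.0 * m)   # imaginary part of the roots
    return "case 3: oscillation", 2.0 * math.pi / beta
```

For instance, with m = 1, a = 0, b = 4 the roots are $\pm 2i$, so the period is $\pi$.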

The oscillations of a physical system which take place without the action
of an exterior force are called characteristic oscillations (eigenvibrations)
of the system. From the previous discussion, it follows that the period of
such oscillations for the systems discussed in examples 2, 3, 4, and 5
depends only on the structure of the system and will be the same for
all oscillations which could possibly arise in it. In example 2 this period
is equal to $2\pi : \sqrt{b/m - a^2/4m^2}$; in example 4 to $2\pi : \sqrt{kps/v\rho l}$, and
in example 5 to $2\pi : \sqrt{1/LC - R^2/4L^2}$.

If a = 0, i.e., if the medium offers no resistance to the motion, then
the amplitude of the oscillations is constant: the point oscillates
harmonically. But if a > 0, i.e., if the medium offers resistance to the motion,
although this resistance is small ($a^2 < 4bm$), then the amplitude of the
oscillations tends to zero and the oscillations die out.

Finally, the solution $x(t) \equiv 0$ of equation (6) in all cases indicates a
state of rest for the point x at the position x = 0, which is called the
position of equilibrium.

If the real parts of both roots of equation (19) are negative, then it can
be seen from formulas (20), (21), and (23) that all the solutions of equation
(6), together with their derivatives, tend to zero for $t \to +\infty$; that
is, the oscillations die out with the passage of time.

However, if the real part of even one of the roots of equation (19) is
positive, then there are solutions of equation (6) not tending to zero for
$t \to +\infty$, so that some of the solutions of (6) would not even be bounded
for $t \to +\infty$. Such a case can occur only for negative b or negative a,
if m > 0. Physically, this would correspond to the case in which the
elastic force does not attract the point x to the equilibrium position but
repels it or else that the resistance of the medium is negative. Such cases
cannot be realized in the physical examples considered at the beginning
of this chapter, but they are entirely realizable in other physical models.

If the real part of the roots $\lambda_1$ and $\lambda_2$ of equation (19) is equal to zero,
which is possible only if the coefficient a in equation (19) is zero, then
the point x(t), as can be seen from formula (23), carries out
harmonic oscillations with bounded amplitude and bounded velocity.

Nonhomogeneous linear equations with constant coefficients. Let us
consider in detail the equation

$$ m\frac{d^2 x}{dt^2} + a\frac{dx}{dt} + bx = A\cos\omega t. \tag{26} $$

This is the equation of linear oscillations of a material point under the
action of an elastic force, of the resistance of a medium, and of an external
periodic force $A\cos\omega t$ (see equation (6') in §1).

Equation (26) is a nonhomogeneous linear equation and (6) is the
corresponding homogeneous equation.

We will now look for the general solution to equation (26).

We note that the sum of a solution of a nonhomogeneous equation and
a solution of the corresponding homogeneous equation is also a solution
of the nonhomogeneous linear equation. Thus, in order to find a general
solution of equation (26), it is sufficient to find any one particular solution.
The general solution of equation (26) will then be given in the form of
the sum of this particular solution and a general solution of the
corresponding homogeneous equation.

It is natural to expect that the motion will follow the rhythm of the
external periodic force and to look for a particular solution of equation
(26) in the form $x = B\cos(\omega t + \delta)$, where B and $\delta$ are as yet
undetermined constants. We will attempt to determine B and $\delta$ in such a way that
the function $x = B\cos(\omega t + \delta)$ will satisfy equation (26). Calculating the
derivatives $dx/dt = -B\omega\sin(\omega t + \delta)$ and $d^2x/dt^2 = -B\omega^2\cos(\omega t + \delta)$
and substituting them into equation (26), we get

$$ m[-B\omega^2\cos(\omega t + \delta)] + a[-B\omega\sin(\omega t + \delta)] + bB\cos(\omega t + \delta) = A\cos\omega t. $$

Applying well-known formulas, we have

$$ B[(b - m\omega^2)\cos(\omega t + \delta) - a\omega\sin(\omega t + \delta)] = B\sqrt{(b - m\omega^2)^2 + a^2\omega^2}\,\cos(\omega t + \delta') = A\cos\omega t, $$

where $\delta' = \delta + \gamma$ and $\gamma = \arctan[a\omega/(b - m\omega^2)]$. Obviously, if we set

$$ \delta = -\arctan\frac{a\omega}{b - m\omega^2} \quad\text{and}\quad B = \frac{A}{\sqrt{(b - m\omega^2)^2 + a^2\omega^2}}, $$

the function $x = B\cos(\omega t + \delta)$ will satisfy equation (26).
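
The formulas for B and $\delta$ can be checked numerically. The sketch below uses hypothetical function names and one deliberate substitution: it computes the angle with `math.atan2` rather than a plain arctangent, so that the phase also comes out correctly when $b - m\omega^2$ is negative. It fails with a division by zero in the exceptional case a = 0, $b = m\omega^2$, where no solution of this form exists.

```python
import math

def forced_oscillation(m, a, b, A, w):
    """Amplitude B and phase delta of the particular solution
    x = B*cos(w*t + delta) of m*x'' + a*x' + b*x = A*cos(w*t)."""
    p, q = b - m * w * w, a * w
    B = A / math.hypot(p, q)        # A / sqrt((b - m w^2)^2 + a^2 w^2)
    delta = -math.atan2(q, p)       # equals -arctan(q/p) when p > 0
    return B, delta

def forced_residual(m, a, b, A, w, t):
    """Left side minus right side of equation (26) for the solution above,
    using the exact derivatives of B*cos(w*t + delta)."""
    B, d = forced_oscillation(m, a, b, A, w)
    x = B * math.cos(w * t + d)
    dx = -B * w * math.sin(w * t + d)
    ddx = -B * w * w * math.cos(w * t + d)
    return m * ddx + a * dx + b * x - A * math.cos(w * t)
```

With, say, m = 1, a = 0.5, b = 4, A = 2, and $\omega$ = 3 the residual should vanish up to rounding for every t.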

A solution of the form $B\cos(\omega t + \delta)$ will always exist if
$(b - m\omega^2)^2 + a^2\omega^2 \neq 0$. In case $(b - m\omega^2)^2 + a^2\omega^2 = 0$, i.e., when
a = 0 and $b = m\omega^2$, equation (26) has the form

$$ m\frac{d^2 x}{dt^2} + m\omega^2 x = A\cos\omega t. $$

A particular solution in this case, as is easily established, is
$x = (At/2\sqrt{mb})\sin\omega t$.

Solutions of the nonhomogeneous equation (26) are called forced
oscillations. The multiplier $\varphi(\omega) = 1/\sqrt{(b - m\omega^2)^2 + a^2\omega^2}$ characterizes
the relation of the amplitude B of the forced oscillation to the amplitude A
of the disturbing force. The graph of the function $\varphi(\omega)$ is called the
resonance curve. The frequency $\omega$ for which $\varphi(\omega)$ attains its maximum is
called the resonant frequency. Let us calculate it. If $\varphi(\omega)$ attains the
maximum at $\omega_1 \neq 0$, then for this value of $\omega$ the derivative of
$(b - m\omega^2)^2 + a^2\omega^2$ vanishes, i.e., $-4(b - m\omega_1^2)m\omega_1 + 2a^2\omega_1 = 0$,
so that $\omega_1 = \sqrt{b/m - a^2/2m^2}$. For this value of $\omega$,

$$ \varphi(\omega_1) = \frac{1}{a\sqrt{b/m - a^2/4m^2}}. $$

Hence it can be seen that the amplitude of the forced oscillation for
$\omega = \omega_1$ is greater for smaller values of a. For very small a, the frequency
$\omega_1$ is very close to the value $\sqrt{b/m}$, i.e., to the frequency of the free
oscillations. For a = 0 and $b = m\omega^2$, as we saw, the forced oscillation has the
form

$$ x = \frac{At}{2\sqrt{mb}}\sin\omega t, $$

i.e., the amplitude of this oscillation increases beyond all bounds as
$t \to +\infty$, a situation which represents the mathematical meaning of
resonance. Resonance will occur if the period of the external force is the
same as the period of one of the characteristic oscillations of the system.
In the practical world, in cases where the period of the external force and
the period of the characteristic oscillations are close together, the
displacements of the system may become extremely large.
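
The resonant frequency and the height of the resonance curve at its peak can be verified with a few lines of code. This is only a sketch; the names `phi` and `resonant_frequency` are illustrative, and the second function assumes $a^2 < 2bm$ so that the square root is real.

```python
import math

def phi(w, m, a, b):
    """Resonance curve: ratio B/A of the forced amplitude to the force amplitude."""
    return 1.0 / math.hypot(b - m * w * w, a * w)

def resonant_frequency(m, a, b):
    """omega_1 = sqrt(b/m - a^2/(2 m^2)); assumes a^2 < 2*b*m."""
    return math.sqrt(b / m - a * a / (2.0 * m * m))

m, a, b = 1.0, 0.2, 4.0
w1 = resonant_frequency(m, a, b)
# Peak value predicted by the closed formula in the text.
peak = 1.0 / (a * math.sqrt(b / m - a * a / (4.0 * m * m)))
```

For small a the value `w1` lies just below $\sqrt{b/m}$, and `phi(w1, m, a, b)` agrees with `peak` while exceeding the values of the curve at neighboring frequencies.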

The possibility of large oscillations is often made use of in the
construction of various kinds of amplifiers, for example in radio technology.
But large oscillations may also lead to the breaking up of structures such
as bridges or the framework of machines. Thus it is very important to
foresee the possibility of resonance or of oscillations close to it.

From the remarks made earlier, any solution of equation (26) can be
written as a sum of the forced oscillation we have found and of one of
the solutions of the homogeneous equation given in formulas (20), (21),
and (23). For a > 0 and b > 0 the solution of the homogeneous equation
tends to zero for $t \to +\infty$, i.e., any motion eventually approximates the
forced oscillations. If a = 0 and b > 0, the forced oscillation is superposed
on a nondecaying characteristic oscillation of the system. For $b = m\omega^2$
and a = 0, we have resonance.

If a periodic external force f(t) is imposed on the system, the forced
oscillations of the system may be found in the following manner. We
may represent f(t) with sufficient exactness as a segment of a trigonometric
series*

$$ \sum_{i=1}^{n} (a_i\cos\omega_i t + b_i\sin\omega_i t). \tag{27} $$

Let us find the forced oscillations corresponding to each term of this sum.
Then the oscillation corresponding to the force f(t) will be found by adding
together the oscillations corresponding to the various terms of the sum
(27). If any of these frequencies is identical with the frequency of a
characteristic oscillation of the system, we will have resonance.
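
The superposition just described can be sketched as follows. The function name and the representation of the sum (27) as a list of triples $(a_i, b_i, \omega_i)$ are assumptions made for this illustration. Each cosine and sine term is scaled by the same gain and shifted by the same phase, since $\sin\omega t = \cos(\omega t - \pi/2)$; the angle is computed with `math.atan2`, and no frequency may coincide with an undamped characteristic frequency.

```python
import math

def forced_response(m, a, b, terms, t):
    """Forced oscillation of m*x'' + a*x' + b*x = f(t) at time t, where
    f(t) = sum of a_i*cos(w_i*t) + b_i*sin(w_i*t) over terms = [(a_i, b_i, w_i), ...]."""
    x = 0.0
    for ai, bi, wi in terms:
        p, q = b - m * wi * wi, a * wi
        gain = 1.0 / math.hypot(p, q)      # amplitude ratio at frequency w_i
        delta = -math.atan2(q, p)          # phase shift at frequency w_i
        x += gain * (ai * math.cos(wi * t + delta) + bi * math.sin(wi * t + delta))
    return x
```

Because the equation is linear, the sum of the individual responses satisfies the equation whose right side is the sum of the individual forces.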


§3. Some General Remarks on the Formation and Solution of
Differential Equations

There are not many differential equations with the property that all
their solutions can be expressed explicitly in terms of simple functions, as
is the case for linear equations with constant coefficients. It is possible to
give simple examples of differential equations whose general solution
cannot be expressed by a finite number of integrals of known functions, or
as one says, in quadratures.

As Liouville showed in 1841, the solution of the Riccati equation of
the form $dy/dx + ay^2 = x^2$, for a > 0, cannot be expressed as a finite
combination of integrals of elementary functions. So it becomes important
to develop methods of approximation to the solutions of differential
equations, which will be applicable to wide classes of equations.

The fact that in such cases we find not exact solutions but only
approximations should not bother us. First of all, these approximate solutions
may be calculated, at least in principle, to any desired degree of accuracy.
Second, it must be emphasized that in most cases the differential equations
describing a physical process are themselves not altogether exact, as can
be seen in all the examples discussed in §1.

An especially good example is provided by the equation (12) for the
acoustic resonator. In deriving this equation, we ignored the compressibility
of the air in the neck of the container and the motion of the air
in the container itself. As a matter of fact, the motion of the air in the
neck sets into motion the mass of the air in the vessel, but these two
motions have different velocities and displacements. In the neck the
displacement of the particles of air is considerably greater than in the
container. Thus we ignored the motion of the air in the container, and
took account only of its compression. For the air in the neck, however,
we ignored the energy of its compression and took account only of the
kinetic energy of its motion.

* Cf. Chapter XII, §7.

To derive the differential equation for a physical pendulum, we ignored
the mass of the string on which it hangs. To derive equation (14) for
electric oscillations in a circuit, we ignored the self-inductance of the wiring
and the resistance of the coils. In general, to obtain a differential equation
for any physical process, we must always ignore certain factors and idealize
others. In view of this, A. A. Andronov drew especial attention to the
fact that for physical investigations we are especially interested in those
differential equations whose solutions do not change much for arbitrary
small changes, in some sense or another, in the equations themselves.
Such differential equations are called “intensitive.” These equations
deserve particularly complete study.

It should be stated that in physical investigations not only are the
differential equations that describe the laws of change of the physical
quantities themselves inexactly defined but even the number of these
quantities is defined only approximately. Strictly speaking, there are no
such things as rigid bodies. So to study the oscillations of a pendulum,
we ought to take into account the deformation of the string from which
it hangs and the deformation of the rigid body itself, which we
approximated by taking it as a material point. In exactly the same way, to study
the oscillations of a load attached to springs, we ought to consider the
masses of the separate coils of the springs. But in these examples it is easy
to show that the character of the motion of the different particles, which
make up the pendulum and its load together with the springs, has little
influence on the character of the oscillation. If we wished to take this
influence into account, the problem would become so complicated that
we would be unable to solve it to any suitable approximation. Our solution
would then bear no closer relation to physical reality than the solution
given in §1 without consideration of these influences. Intelligent
idealization of a problem is always unavoidable. To describe a process, it is
necessary to take into account the essential features of the process but by
no means to consider every feature without exception. This would not
only complicate the problem a great deal but in most cases would result
in the impossibility of calculating a solution. The fundamental problem
of physics or mechanics, in the investigation of any phenomenon, is to
find the smallest number of quantities, which with sufficient exactness
describe the state of the phenomenon at any given moment, and then
to set up the simplest differential equations that are good descriptions of
the laws governing the changes in these quantities. This problem is often
very difficult. Which features are the essential ones and which are
nonessential is a question that in the final analysis can be decided only by long
experience. Only by comparing the answers provided by an idealized
argument with the results of experiment can we judge whether the
idealization was a valid one.

The mathematical problem of the possibility of decreasing the number
of quantities may be formulated in one of the simplest and most
characteristic cases, as follows.

Suppose that to begin with we characterize the state of a physical system
at time t by the two magnitudes $x_1(t)$ and $x_2(t)$. Let the differential
equations expressing their rates of change have the form

$$ \frac{dx_1}{dt} = f_1(t, x_1, x_2), \qquad \varepsilon\frac{dx_2}{dt} = f_2(t, x_1, x_2). \tag{28} $$

In the second equation the coefficient of the derivative is a small constant
parameter $\varepsilon$. If we put $\varepsilon = 0$, the second of equations (28) will cease to be
a differential equation. It then takes the form

$$ f_2(t, x_1, x_2) = 0. $$

From this equation, we define $x_2$ as a function of t and $x_1$ and we substitute
it into the first of the equations (28). We then have the differential equation

$$ \frac{dx_1}{dt} = F(t, x_1) $$

for the single variable $x_1$. In this way the number of parameters entering
into the situation is reduced to one. We now ask, under what conditions
will the error introduced by taking $\varepsilon = 0$ be small. Of course, it may
happen that as $\varepsilon \to 0$ the value $dx_2/dt$ grows beyond all bounds, so that
the right side of the second of equations (28) does not tend to zero as
$\varepsilon \to 0$.
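
A small numerical experiment can make the question concrete. The system below is a hypothetical example chosen for this sketch, not one taken from the text: $dx_1/dt = -x_2$, $\varepsilon\,dx_2/dt = x_1 - x_2$, with $x_1(0) = x_2(0) = 1$. Setting $\varepsilon = 0$ gives $x_2 = x_1$ and the reduced equation $dx_1/dt = -x_1$. Both are integrated with the explicit Euler method, with a step small enough compared with $\varepsilon$ for the computation to remain stable.

```python
def euler_full(eps, t_end, h):
    """Explicit Euler for x1' = -x2, eps*x2' = x1 - x2, x1(0) = x2(0) = 1.
    Returns x1(t_end). The step h should be small compared with eps."""
    x1, x2 = 1.0, 1.0
    for _ in range(round(t_end / h)):
        # Both updates use the values from the previous step.
        x1, x2 = x1 + h * (-x2), x2 + h * (x1 - x2) / eps
    return x1

def euler_reduced(t_end, h):
    """Explicit Euler for the reduced equation x1' = -x1 (eps = 0, x2 = x1)."""
    x1 = 1.0
    for _ in range(round(t_end / h)):
        x1 += h * (-x1)
    return x1

gap_small = abs(euler_full(0.01, 1.0, 0.001) - euler_reduced(1.0, 0.001))
gap_large = abs(euler_full(0.1, 1.0, 0.001) - euler_reduced(1.0, 0.001))
```

In this particular example the difference between the full and the reduced solution shrinks as $\varepsilon$ decreases, so here the reduction is harmless; the text's warning concerns systems for which this fails.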


§4. Geometric Interpretation of the Problem of
Integrating Differential Equations; Generalization of the Problem

For simplicity we will consider initially only one differential equation
of the first order with one unknown function

$$ \frac{dy}{dx} = f(x, y), \tag{29} $$

where the function f(x, y) is defined on some domain G in the (x, y) plane.





This equation determines at each point of the domain the slope of the
tangent to the graph of a solution of equation (29) at that point. If at each
point (x, y) of the domain G we indicate by means of a line segment
the direction of the tangent (either of the two directions may be used) as
determined by the value of f(x, y) at this point, we obtain a field of
directions. Then the problem of finding a solution of the differential equation
(29) for the initial condition $y(x_0) = y_0$ may be formulated thus: In the
domain G we have to find a curve $y = \varphi(x)$, passing through the point
$M_0(x_0, y_0)$, which at each of its points has a tangent whose slope is given
by equation (29), or briefly, which has at each of its points a preassigned
direction.

From the geometric point of view this statement of the problem has
two unnatural features:

1. By requiring that the slope of the tangent at any given point (x, y)
of the domain G be equal to f(x, y), we automatically exclude tangents
parallel to Oy, since we generally consider only finite magnitudes; in
particular, it is assumed that the function f(x, y) on the right side of
equation (29) assumes only finite values.

2. By considering only curves which are graphs of functions of x, we
also exclude those curves which are intersected more than once by a line
perpendicular to the axis Ox, since we consider only single-valued
functions; in particular, every solution of a differential equation is assumed to
be a single-valued function of x.

So let us generalize to some extent the preceding statement of the
problem of finding a solution to the differential equation (29). Namely,
we will now allow the tangent at some points to be parallel to the axis Oy.
At these points, where the slope of the tangent with respect to the axis
Ox has no meaning, we will take the slope with respect to the axis Oy.
In other words, we consider, together with the differential equation (29),
the equation

$$ \frac{dx}{dy} = f_1(x, y), \tag{29'} $$

where $f_1(x, y) = 1/f(x, y)$, if $f(x, y) \neq 0$, using the second equation when
the first is meaningless. The problem of integrating the differential
equations (29) and (29') then becomes: In the domain G to find all curves
having at each point the tangent defined by these equations. These curves
will be called integral curves (integral lines) of the equations (29) and
(29') or of the tangent field given by these equations. In place of the
plural “equations (29), (29')”, we will often use the singular “equation
(29), (29')”. It is clear that the graph of any solution of equation (29)
will also be an integral curve of equation (29), (29'). But not every integral
curve of equation (29), (29') will be the graph of a solution of equation
(29). This case will occur, for example, if some perpendicular to the axis
Ox intersects this curve at more than one point.

In what follows, if it can be clearly shown that

$$ f(x, y) = \frac{M(x, y)}{N(x, y)}, $$

then we will write only the equation

$$ \frac{dy}{dx} = \frac{M(x, y)}{N(x, y)} $$

and omit writing

$$ \frac{dx}{dy} = \frac{N(x, y)}{M(x, y)}. $$

Sometimes in place of these equations we introduce a parameter t, and
write the system of equations

$$ \frac{dx}{dt} = N(x, y), \qquad \frac{dy}{dt} = M(x, y), $$

where x and y are considered as functions of t.


Example 1. The equation

$$ \frac{dy}{dx} = \frac{y}{x} \tag{30} $$

defines a tangent field everywhere except at the origin. This tangent field
is sketched in figure 7. All the tangents given by equation (30) pass
through the origin.

It is clear that for every k the function

$$ y = kx \tag{31} $$

is a solution of equation (30). The collection of all integral curves of this
equation is then defined by the relation

$$ ax + by = 0, \tag{32} $$

where a and b are arbitrary constants, not both zero. The axis Oy is an
integral curve of equation (30), but it is not the graph of a solution of it.

Since equation (30) does not define a tangent field at the origin, the
curves (31) and (32) are, strictly speaking, integral curves everywhere
except at the origin. Thus it is more correct to say that the integral curves
of equation (30) are not straight lines passing through the origin but
half lines issuing from it.

Example 2. The equation

$$ \frac{dy}{dx} = -\frac{x}{y} \tag{33} $$

defines a field of tangents everywhere except at the origin, as sketched in
figure 8. The tangents defined at a given point (x, y) by equations (30)
and (33) are perpendicular to each other. It is clear that all circles centered
at the origin will be integral curves of equation (33). However the solutions
of this equation will be the functions

$$ y = +\sqrt{R^2 - x^2}, \qquad y = -\sqrt{R^2 - x^2}, \qquad -R < x < R. $$

For brevity in what follows we will sometimes say “a solution passes
through the point (x, y)” in place of the more exact statement “the graph
of a solution passes through the point (x, y).”
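
The tangent fields of Examples 1 and 2 can be computed directly from the pair (N, M) of the parametric system $dx/dt = N$, $dy/dt = M$, which handles tangents parallel to Oy without a special case. The sketch below uses hypothetical names (`direction`, `radial`, `circular`).

```python
import math

def direction(M, N, x, y):
    """Unit vector along the tangent of dy/dx = M/N at the point (x, y).
    Working with the pair (N, M) also covers vertical tangents,
    where N = 0 and the slope with respect to Ox has no meaning."""
    n, m = N(x, y), M(x, y)
    r = math.hypot(n, m)
    if r == 0.0:
        raise ValueError("tangent field is not defined at this point")
    return n / r, m / r

def radial(x, y):
    # Example 1: dy/dx = y/x, so M = y and N = x.
    return direction(lambda x, y: y, lambda x, y: x, x, y)

def circular(x, y):
    # Example 2: dy/dx = -x/y, so M = -x and N = y.
    return direction(lambda x, y: -x, lambda x, y: y, x, y)
```

At every point other than the origin the two directions have zero scalar product, which is the perpendicularity noted in Example 2; at the origin both functions raise an error, since neither field is defined there.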


§5. Existence and Uniqueness of the Solution of a
Differential Equation; Approximate Solution of Equations

The question of existence and uniqueness of the solution of a differential
equation. We return to the differential equation (17) of arbitrary order n.
Generally, it has infinitely many solutions and in order that we may pick
from all the possible solutions some one specific one, it is necessary to
attach to the equation some supplementary conditions, the number of
which should be equal to the order n of the equation. Such conditions
may be of extremely varied character, depending on the physical,
mechanical, or other significance of the original problem. For example, if we have
to investigate the motion of a mechanical system beginning with some
specific initial state, the supplementary conditions will refer to a specific
(initial) value of the independent variable and will be called initial
conditions of the problem. But if we want to define the curve of a
cable in a suspension bridge, or of a loaded beam resting on supports at
each end, we encounter conditions corresponding to different values of
the independent variable, at the ends of the cable or at the points of support
of the beam. We could give many other examples showing the variety of
conditions to be fulfilled in connection with differential equations.

We will assume that the supplementary conditions have been defined
and that we are required to find a solution of equation (17) that satisfies
them. The first question we must consider is whether any such solution
exists at all. It often happens that we cannot be sure of this in advance.
Assume, say, that equation (17) is a description of the operation of some
physical apparatus and suppose we want to determine whether periodic
motion occurs in this apparatus. The supplementary conditions will
then be conditions for the periodic repetition of the initial state in the
apparatus, and we cannot say ahead of time whether or not there will
exist a solution which satisfies them.

In any case the investigation of problems of existence and uniqueness
of a solution makes clear just which conditions can be fulfilled for a given
differential equation and which of these conditions will define the solution
in a unique manner. But the determination of such conditions and the
proof of existence and uniqueness of the solution for a differential equation
corresponding to some physical problem also has great value for the
physical theory itself. It shows that the assumptions adopted in setting
up the mathematical description of the physical event are on the one
hand mutually consistent and on the other constitute a complete
description of the event.

The methods of investigating the existence problem are manifold, but
among them an especially important role is played by what are called
direct methods. The proof of the existence of the required solution is
provided by the construction of approximate solutions, which are proved
to converge to the exact solution of the problem. These methods not
only establish the existence of an exact solution, but also provide a way,
in fact the principal one, of approximating it to any desired degree of
accuracy.

For the rest of this section we will consider, for the sake of definiteness,
a problem with initial data, for which we will illustrate the ideas of Euler’s
method and the method of successive approximations.

Euler’s method of broken lines. Consider in some domain G of the
(x, y) plane the differential equation

$$ \frac{dy}{dx} = f(x, y). \tag{34} $$

As we have already noted, equation (34) defines in G a field of tangents.
We choose any point $(x_0, y_0)$ of G. Through it there will pass a straight
line $L_0$ with slope $f(x_0, y_0)$. On the straight line $L_0$ we choose a point
$(x_1, y_1)$, sufficiently close to $(x_0, y_0)$; in figure 9 this point is indicated
by the number 1. We draw the straight line $L_1$ through the point $(x_1, y_1)$
with slope $f(x_1, y_1)$ and on it mark the point $(x_2, y_2)$; in the figure this
point is denoted by the number 2. Then on the straight line $L_2$
corresponding to the point $(x_2, y_2)$ we mark the point $(x_3, y_3)$, and
continue in the same manner with $x_0 < x_1 < x_2 < x_3 < \cdots$. It is assumed,
of course, that all the points $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$, ... are in the
domain G. The broken line joining these points is called an Euler broken
line. One may also construct an Euler broken line in the direction of
decreasing x; the corresponding vertices on our figure are denoted by
-1, -2, -3.

Fig. 9.

It is reasonable to expect that every Euler broken line through the point
$(x_0, y_0)$ with sufficiently short segments gives a representation of an
integral curve l passing through the point $(x_0, y_0)$, and that with decrease
in the length of the links, i.e., when the length of the longest link tends to
zero, the Euler broken line will approximate this integral curve.

Here, of course, it is assumed that the intégral curve exists. In fact it is 
not hard to prove that if the function f(x, y) is continuous in the domain 
G, one may find an infinité sequence of Euler broken Unes, the length of 
the largest links tending to zéro, which converges to an intégral curve /. 
However, one usually cannot prove uniqueness: there may exist different 
sequences of Euler broken Unes that converge to different intégral curves 
passing through one and the same point (a: 0 , y 0 ). M. A. Lavrent’ev has 
constructed an example of a differential équation of the form (29) with a 
continuous function /[x, y), such that in any neighborhood of any point P 
of the domain G there passes not one but at least two intégral curves. 
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In order that through every point of the domain G there pass only one 
intégral curve, it is necessary to impose on the function f(x, y) certain 
conditions beyond that of continuity. Il is sufficient, for example, to assume 
that the function f(x, y) is continuous and has a bounded dérivative with 
respect to y on the whole domain G. In this case it may be proved that 
through each point of G there passes one and only one intégral curve and 
that every sequence of Euler broken linespassing through the point (x 0 , y 0 ) 
converges uniformly to this unique intégral curve, as the length of the 
longest link of the broken Unes tends to zéro. Thus for sufficiently small 
links the Euler broken line may be taken as an approximation to the 
intégral curve of équation (34). 

From the preceding it can be seen that the Euler broken lines are so
constituted that small pieces of the integral curves are replaced by line
segments tangent to these integral curves. In practice, many approximations
to integral curves of the differential equation (34) consist not of
straight-line segments tangent to the integral curves, but of parabolic
segments that have a higher order of tangency with the integral curve.
In this way it is possible to find an approximate solution with the same
degree of accuracy in a smaller number of steps (with a smaller number of
links in the approximating curve). The coefficients of the equation for
the (higher order) parabola

$$y = a_0 + a_1(x - x_k) + a_2(x - x_k)^2 + \cdots + a_n(x - x_k)^n, \tag{35}$$

which at the point $(x_k, y_k)$ has nth-order tangency with the integral
curves of equation (34) through this point, are given by the following
formulas:

$$a_0 = y_k, \tag{36}$$

$$a_1 = \left(\frac{dy}{dx}\right)_{x = x_k} = f(x_k, y_k), \tag{36'}$$

$$2!\, a_2 = \left(\frac{d^2 y}{dx^2}\right)_{x = x_k} = \left(\frac{d f(x, y(x))}{dx}\right)_{x = x_k} = f'_x(x_k, y_k) + f'_y(x_k, y_k) f(x_k, y_k), \tag{36''}$$

$$3!\, a_3 = \left(\frac{d^3 y}{dx^3}\right)_{x = x_k} = f''_{xx}(x_k, y_k) + 2 f''_{xy}(x_k, y_k) f(x_k, y_k) + f''_{yy}(x_k, y_k) f^2(x_k, y_k) + f'^{\,2}_y(x_k, y_k) f(x_k, y_k) + f'_y(x_k, y_k) f'_x(x_k, y_k). \tag{36'''}$$

The polynomial (35) is needed only in order to compute its value for
$x = x_{k+1}$. The actual values of the coefficients $a_0, a_1, a_2, \ldots, a_n$
themselves are not needed. There are many ways of computing the value for
$x = x_{k+1}$ of the polynomial (35) whose coefficients are given by formulas
(36), without computing the coefficients $a_0, a_1, \ldots, a_n$ themselves.
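
As a rough illustration of the gain from parabolic links, the following Python sketch takes the most direct route and evaluates the parabola (35) with n = 2 from the coefficients $a_1$ and $a_2$ themselves. The function names, the equal step h, and the test equation dy/dx = y (for which $f'_x = 0$ and $f'_y = 1$) are assumptions made for the example; the partial derivatives are supplied by hand.

```python
import math


def taylor2_step(f, fx, fy, xk, yk, h):
    """Value at x_{k+1} = xk + h of the parabola (35) with n = 2."""
    a1 = f(xk, yk)                              # formula (36')
    a2 = 0.5 * (fx(xk, yk) + fy(xk, yk) * a1)   # formula (36'') divided by 2!
    return yk + a1 * h + a2 * h * h


def integrate(step, x0, y0, h, n):
    """Apply a one-step rule n times and return the final ordinate."""
    x, y = x0, y0
    for _ in range(n):
        y = step(x, y, h)
        x += h
    return y


# Illustrative equation dy/dx = y, so f = y, f'_x = 0, f'_y = 1.
f = lambda x, y: y
fx = lambda x, y: 0.0
fy = lambda x, y: 1.0

# Ten parabolic links versus ten straight (Euler) links on the interval [0, 1].
parabolic = integrate(lambda x, y, h: taylor2_step(f, fx, fy, x, y, h), 0.0, 1.0, 0.1, 10)
straight = integrate(lambda x, y, h: y + h * f(x, y), 0.0, 1.0, 0.1, 10)
print(abs(parabolic - math.e), abs(straight - math.e))
```

With the same number of links, the parabolic segments land noticeably closer to the exact value exp(1) than the straight segments do.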

Other approximation methods exist for finding the solution of the
differential equation (34), which are based on other ideas. One convenient
method was developed by A. N. Krylov (1863-1945).

The method of successive approximations. We now describe another
method of successive approximation, which is as widely used as the method
of the Euler broken lines. We assume again that we are required to find
a solution y(x) of the differential equation (34) satisfying the initial
condition

$$y(x_0) = y_0.$$

For the initial approximation to the function y(x), we take an arbitrary
function $y_0(x)$. For simplicity we will assume that it also satisfies the
initial condition, although this is not necessary. We substitute it into the
right side f(x, y) of the equation for the unknown function y and construct
a first approximation $y_1$ to the solution y from the following requirements:

$$\frac{dy_1}{dx} = f[x, y_0(x)], \qquad y_1(x_0) = y_0.$$

Since there is a known function on the right side of the first of these
equations, the function $y_1(x)$ may be found by integration:

$$y_1(x) = y_0 + \int_{x_0}^{x} f[t, y_0(t)]\, dt.$$

It may be expected that $y_1(x)$ will differ from the solution y(x) by less
than $y_0(x)$ does, since in the construction of $y_1(x)$ we made use of the
differential equation itself, which should probably introduce a correction
into the original approximation. One would also think that if we improve the
first approximation $y_1(x)$ in the same way, then the second approximation

$$y_2(x) = y_0 + \int_{x_0}^{x} f[t, y_1(t)]\, dt$$

will be still closer to the desired solution.

Let us assume that this process of improvement has been continued
indefinitely and that we have constructed the sequence of approximations

$$y_0(x),\ y_1(x),\ \ldots,\ y_n(x),\ \ldots.$$





Will this sequence converge to the solution y(x)?

More detailed investigations show that if f(x, y) is continuous and $f'_y$ is
bounded in the domain G, the functions $y_n(x)$ will in fact converge to the
exact solution y(x), at least for all x sufficiently close to $x_0$, and that
if we break off the computation after a sufficient number of steps, we will
be able to find the solution y(x) to any desired degree of accuracy.
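
The successive approximations can be imitated numerically. In the sketch below, which is an illustration and not the book's procedure, each function $y_n(x)$ is stored as a list of values on an equally spaced grid and the integral is replaced by the trapezoid rule; the function name, the grid, the constant initial approximation $y_0(x) = y_0$, and the test equation dy/dx = y are all assumptions.

```python
import math


def picard_iterations(f, x0, y0, x_end, n_grid, n_iter):
    """Tabulate the n_iter-th successive approximation on a grid over [x0, x_end]."""
    h = (x_end - x0) / n_grid
    xs = [x0 + i * h for i in range(n_grid + 1)]
    y = [y0] * (n_grid + 1)  # initial approximation: the constant function y0
    for _ in range(n_iter):
        # Integrand f[t, y_{n-1}(t)] sampled at the grid points.
        g = [f(x, yi) for x, yi in zip(xs, y)]
        new_y = [y0]
        for i in range(1, n_grid + 1):
            # Cumulative trapezoid rule for the integral from x0 to xs[i].
            new_y.append(new_y[-1] + 0.5 * h * (g[i - 1] + g[i]))
        y = new_y
    return xs, y


# Illustrative equation dy/dx = y with y(0) = 1; the exact solution is exp(x).
for n_iter in (1, 2, 5, 10):
    xs, y = picard_iterations(lambda x, y: y, 0.0, 1.0, 1.0, 1000, n_iter)
    print(n_iter, abs(y[-1] - math.e))
```

For this equation the n-th approximation is, up to the quadrature error, the n-th partial sum of the exponential series, so the printed errors at x = 1 fall off quickly with the number of iterations.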

Exactly in the same way as for the integral curves of equation (34),
we may also find approximations to integral curves of a system of two or
more differential equations of the first order. Essentially the necessary
condition here is to be able to solve these equations for the derivatives of
the unknown functions. For example, suppose we are given the system

$$\frac{dy}{dx} = f_1(x, y, z), \qquad \frac{dz}{dx} = f_2(x, y, z). \tag{37}$$

Assuming that the right sides of these equations are continuous and
have bounded derivatives with respect to y and z in some domain G in
space, it may be shown that through each point $(x_0, y_0, z_0)$ of the
domain G, in which the right sides of the equations in (37) are defined,
there passes one and only one integral curve

$$y = \varphi(x), \qquad z = \psi(x)$$

of the system (37). The functions $f_1(x, y, z)$ and $f_2(x, y, z)$ give the
direction numbers, at the point (x, y, z), of the tangent to the integral
curve passing through this point. To find the functions $\varphi(x)$ and
$\psi(x)$ approximately, we may apply the Euler broken line method or other
methods similar to the ones applied to the equation (34).

The process of approximate computation of the solution of ordinary
differential equations with initial conditions may be carried out on
computing machines. There are electronic machines that work so rapidly
that if, for example, the machine is programmed to compute the trajectory
of a projectile, this trajectory can be found in a shorter space of time
than it takes for the projectile to hit its target (cf. Chapter XIV).


The connection between differential equations of various orders and a
system of a large number of equations of first order. A system of
ordinary differential equations, when solved for the derivative of highest
order of each of the unknown functions, may in general be reduced, by
the introduction of new unknown functions, to a system of equations of
the first order, which is solved for all the derivatives. For example, consider
the differential equation

$$\frac{d^2 y}{dx^2} = f\left(x, y, \frac{dy}{dx}\right). \tag{38}$$

We set

$$\frac{dy}{dx} = z. \tag{39}$$

Then equation (38) may be written in the form

$$\frac{dz}{dx} = f(x, y, z). \tag{40}$$

Hence, to every solution of equation (38) there corresponds a solution
of the system consisting of equations (39) and (40). It is easy to show that
to every solution of the system of equations (39) and (40) there corresponds
a solution of equation (38).
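
A short Python sketch may make the reduction concrete. It assumes the second-order equation is written as $d^2y/dx^2 = f(x, y, dy/dx)$, introduces z = dy/dx, and applies the Euler broken line method to the resulting pair of first-order equations. The helper names, the equal step h, and the test equation $d^2y/dx^2 = -y$ are illustrative assumptions.

```python
import math


def euler_system(f1, f2, x0, y0, z0, h, n):
    """Euler broken line for dy/dx = f1(x, y, z), dz/dx = f2(x, y, z)."""
    x, y, z = x0, y0, z0
    for _ in range(n):
        # Both unknowns advance along the tangent direction (1, f1, f2).
        y, z = y + h * f1(x, y, z), z + h * f2(x, y, z)
        x += h
    return x, y, z


def second_order_as_system(f):
    """Right sides of the first-order system obtained by setting z = dy/dx."""
    return (lambda x, y, z: z), (lambda x, y, z: f(x, y, z))


# Illustrative equation d2y/dx2 = -y with y(0) = 0, dy/dx(0) = 1; solution y = sin x.
f1, f2 = second_order_as_system(lambda x, y, z: -y)
x, y, z = euler_system(f1, f2, 0.0, 0.0, 1.0, 1e-4, 10000)
print(y - math.sin(1.0), z - math.cos(1.0))
```

The second component z tracks the derivative dy/dx, so one run yields approximations to both the solution and its slope.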


Equations not explicitly containing the independent variable. The
problems of the pendulum, of the Helmholtz acoustic resonator, of a
simple electric circuit, or of an electron-tube generator considered in §1
lead to differential equations in which the independent variable (time)
does not explicitly appear. We mention equations of this type here, because
the corresponding differential equations of the second order may be
reduced in each case to a single differential equation of the first order
rather than to a system of first-order equations as in the paragraph
above for the general equation of the second order. This reduction greatly
simplifies their study.

Let us then consider a differential equation of the second order, not
containing the argument t in explicit form

$$F\left(x, \frac{dx}{dt}, \frac{d^2 x}{dt^2}\right) = 0. \tag{41}$$

We set

$$\frac{dx}{dt} = y \tag{42}$$

and consider y as a function of x, so that

$$\frac{d^2 x}{dt^2} = \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt} = y\frac{dy}{dx}.$$

Then equation (41) may be rewritten in the form

$$F\left(x, y, y\frac{dy}{dx}\right) = 0. \tag{43}$$


In this manner, to every solution of equation (41) there corresponds a
unique solution of equation (43). Also to each of the solutions $y = \varphi(x)$
of equation (43) there correspond infinitely many solutions of equation
(41). These solutions may be found by integrating the equation

$$\frac{dx}{dt} = \varphi(x), \tag{44}$$

where x is considered as a function of t.

It is clear that if this equation is satisfied by a function x = x(t), then
it will also be satisfied by any function of the form $x(t + t_0)$, where $t_0$ is
an arbitrary constant.

It may happen that not every integral curve of equation (43) is the graph
of a single function of x. This will happen, for example, if the curve is
closed. In this case the integral curve of equation (43) must be split up
into a number of pieces, each of which is the graph of a function of x.
For every one of these pieces, we have to find an integral of equation (44).

The values of x and dx/dt which at each instant characterize the state
of the physical system corresponding to equation (41) are called the
phases of the system, and the (x, y) plane is correspondingly called the
phase plane for equation (41). To every solution x = x(t) of this equation
there corresponds the curve

$$x = x(t), \qquad y = x'(t)$$

in the (x, y) plane; t here is considered as a parameter. Conversely, to
every integral curve $y = \varphi(x)$ of equation (43) in the (x, y) plane there
corresponds an infinite set of solutions of the form $x = x(t + t_0)$ for
equation (41); here $t_0$ is an arbitrary constant. Information about the
behavior of the integral curves of equation (43) in the plane is easily
transformed into information about the character of the possible solutions
of equation (41). Every closed integral curve of equation (43) corresponds,
for example, to a periodic solution of equation (41).
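
The link between closed curves in the phase plane and periodic solutions can be checked numerically on a simple case. The sketch below is an illustration only: it assumes the particular equation $d^2x/dt^2 + x = 0$, for which (43) becomes $y\,dy/dx + x = 0$ with the closed integral curves $x^2 + y^2 = \text{const}$, and it uses the classical fourth-order Runge-Kutta rule (a technique not discussed in the text) to trace the phase point. The function name is hypothetical.

```python
import math


def phase_trajectory(g, x0, y0, t_end, n):
    """Integrate dx/dt = y, dy/dt = g(x, y) with the classical Runge-Kutta rule.

    Returns the phase points (x, y) at t = 0, h, 2h, ..., t_end.
    """
    h = t_end / n
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(n):
        k1x, k1y = y, g(x, y)
        k2x, k2y = y + 0.5 * h * k1y, g(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = y + 0.5 * h * k2y, g(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = y + h * k3y, g(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        points.append((x, y))
    return points


# Assumed example: d2x/dt2 = -x, started at the phase point (1, 0).
points = phase_trajectory(lambda x, y: -x, 1.0, 0.0, 2 * math.pi, 1000)
x_end, y_end = points[-1]
drift = max(abs(x * x + y * y - 1.0) for x, y in points)
print(x_end, y_end, drift)
```

The phase point stays on the circle $x^2 + y^2 = 1$ and returns to its starting position after the time $2\pi$, which is the period of the corresponding solution x = cos t.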

If we subject equation (6) to the transformation (42), we obtain

$$\frac{dy}{dx} = \frac{-ay - bx}{my}. \tag{45}$$

Setting v = x and dv/dt = y in equation (16), in like manner we get

$$\frac{dy}{dx} = \frac{-[R - M(a_1 + 2a_2 x + 3a_3 x^2)]\, y - x}{y}. \tag{46}$$

Just as the state at every instant of the physical system corresponding to
the second-order equation (41) is characterized by the two magnitudes*
(phases) x and y = dx/dt, the state of a physical system described by
equations of higher order or by a system of differential equations is
characterized by a larger number of magnitudes (phases). Instead of a
phase plane, we then speak of a phase space.

* The values of $d^2x/dt^2$, $d^3x/dt^3$, ... at the same instant of time are defined by the
values of x and dx/dt from equation (41) and from the equations obtained from (45)
by differentiation (cf. formula (36)).


§6. Singular Points 

Let the point P(x, y) be in the interior of the domain G in which we
consider the differential equation

$$\frac{dy}{dx} = \frac{M(x, y)}{N(x, y)}. \tag{47}$$

If there exists a neighborhood R of the point P through each point of
which passes one and only one integral curve of (47), then the point P is
called an ordinary point of equation (47). But if such a neighborhood does
not exist, then the point P is called a singular point of this equation. The
study of singular points is very important in the qualitative theory of
differential equations, which we will consider in the next section.

Particularly important are the so-called isolated singular points, i.e.,
singular points in some neighborhood of each of which there are no other
singular points. In applications one often encounters them in investigating
equations of the form (47), where M(x, y) and N(x, y) are functions with
continuous derivatives of high orders with respect to x and y. For such
equations, all the interior points of the domain at which $M(x, y) \neq 0$ or
$N(x, y) \neq 0$ are ordinary points. Let us now consider any interior point
$(x_0, y_0)$ where M(x, y) = N(x, y) = 0. To simplify the notation we will
assume that $x_0 = 0$ and $y_0 = 0$. This can always be arranged by translating
the original origin of coordinates to the point $(x_0, y_0)$. Expanding M(x, y)
and N(x, y) by Taylor’s formula into powers of x and y and restricting
ourselves to terms of the first order, we have, in a neighborhood of the
point (0, 0),

$$\frac{dy}{dx} = \frac{M'_x(0, 0)\, x + M'_y(0, 0)\, y + \varphi_1(x, y)}{N'_x(0, 0)\, x + N'_y(0, 0)\, y + \varphi_2(x, y)},$$


where $\varphi_1(x, y)$ and $\varphi_2(x, y)$ are functions of x and y for which

$$\lim_{x \to 0,\, y \to 0} \frac{\varphi_1(x, y)}{\sqrt{x^2 + y^2}} = 0 \quad \text{and} \quad \lim_{x \to 0,\, y \to 0} \frac{\varphi_2(x, y)}{\sqrt{x^2 + y^2}} = 0.$$

Equations (45) and (46) are of this form. Equation (45) does not define
either dy/dx or dx/dy for x = 0 and y = 0. If the determinant

$$\begin{vmatrix} M'_x(0, 0) & M'_y(0, 0) \\ N'_x(0, 0) & N'_y(0, 0) \end{vmatrix} \neq 0,$$

then, whatever value we assign to dy/dx at the origin, the origin will be a
point of discontinuity for the values dy/dx and dx/dy, since they tend to
different limits depending on the manner of approach to the origin.
The origin is a singular point for our differential equation.

It has been shown that the character of the behavior of the integral
curves near an isolated singular point (here the origin) is not influenced
by the behavior of the terms $\varphi_1(x, y)$ and $\varphi_2(x, y)$ in the numerator and
denominator, provided only that the real part of both roots of the equation

$$\begin{vmatrix} \lambda - M'_y(0, 0) & -M'_x(0, 0) \\ -N'_y(0, 0) & \lambda - N'_x(0, 0) \end{vmatrix} = 0 \tag{49}$$

is different from zero. Thus, in order to form some idea of this behavior,
we study the behavior near the origin of the integral curves of the equation

$$\frac{dy}{dx} = \frac{ax + by}{cx + dy}, \tag{50}$$

for which the determinant

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \neq 0.$$

We note that the arrangement of the integral curves in the neighborhood
of a singular point of a differential equation has great interest for many
problems of mechanics, for example in the investigation of the trajectories
of motions near the equilibrium position.

It has been shown that everywhere in the plane it is possible to choose
coordinates $\xi$, $\eta$, connected with x, y by the equations

$$x = k_{11}\xi + k_{12}\eta, \qquad y = k_{21}\xi + k_{22}\eta, \tag{51}$$

where the $k_{ij}$ are real numbers such that equation (50) is transformed into
one of the following three types:

$$1)\quad \frac{d\eta}{d\xi} = k\frac{\eta}{\xi}, \quad \text{where } k = \frac{\lambda_2}{\lambda_1}; \tag{52}$$

$$2)\quad \frac{d\eta}{d\xi} = \frac{\xi + \eta}{\xi}; \tag{53}$$

$$3)\quad \frac{d\eta}{d\xi} = \frac{\beta\xi + \alpha\eta}{\alpha\xi - \beta\eta}. \tag{54}$$

Here $\lambda_1$ and $\lambda_2$ are the roots of the equation

$$\begin{vmatrix} \lambda - b & -a \\ -d & \lambda - c \end{vmatrix} = 0. \tag{55}$$

If these roots are real and different, then equation (50) is transformed into
the form (52). If these roots are equal, then equation (50) is transformed
either into the form (52) or into the form (53), depending on whether
$a^2 + d^2 = 0$ or $a^2 + d^2 \neq 0$. If the roots of equation (55) are complex,
$\lambda = \alpha \pm \beta i$, then equation (50) is transformed into the form (54).

We will consider each of the equations (52), (53), (54). To begin with,
we note the following.

Even though the axes Ox and Oy were mutually perpendicular, the axes
$O\xi$ and $O\eta$ need not, in general, be so. But to simplify the diagrams, we
will assume they are perpendicular. Further, in the transformation (51)
the scales on the $O\xi$ and $O\eta$ axes may be changed; they may not be the
same as the ones originally chosen on the axes Ox and Oy. But again,
for the sake of simplicity, we assume that the scales are not changed.
Thus, for example, in place of the concentric circles, as in figure 8,
there could in general occur a family of similar and similarly placed ellipses
with common center at the origin.

All integral curves of equation (52) are given by a relation of the form

$$a\eta + b|\xi|^k = 0,$$

where a and b are arbitrary constants.

The integral curves of equation (52) are graphed in figure 10; here we
have assumed that k > 1. In this case all integral curves except one,
the axis $O\eta$, are tangent at the origin to the axis $O\xi$. The case 0 < k < 1
is the same as the case k > 1 with interchange of $\xi$ and $\eta$, i.e., we have
only to interchange the roles of the axes $\xi$ and $\eta$. For k = 1, equation
(52) becomes equation (30), whose integral curves were illustrated in
figure 7.

An illustration of the integral curves of equation (52) for k < 0 is given
in figure 11. In this case we have only two integral curves that pass through
the point O: these are the axis $O\xi$ and the axis $O\eta$. All other integral
curves, after approaching the origin no closer than to some minimal
distance, recede again from the origin. In this case we say that the point
O is a saddle point because the integral curves are similar to the contours
on a map representing the summit of a mountain pass (saddle).

All integral curves of equation (53) are given by the equation

$$b\eta = \xi(a + b\ln|\xi|),$$

where a and b are arbitrary constants. These are illustrated schematically
in figure 12; all of them are tangent to the axis $O\eta$ at the origin.

If every integral curve entering some neighborhood of the singular
point O passes through this point and has a definite direction there, i.e.,
has a definite tangent at the origin, as is illustrated in figures 10 and 12,
then we say that the point O is a node.

Equation (54) is most easily integrated if we change to polar coordinates
$\rho$ and $\varphi$, putting

$$\xi = \rho\cos\varphi, \qquad \eta = \rho\sin\varphi.$$

Then this equation changes into the equation

$$\frac{d\rho}{d\varphi} = k\rho, \quad \text{where } k = \frac{\alpha}{\beta},$$

whose solutions are

$$\rho = Ce^{k\varphi}. \tag{56}$$

If k > 0 then all the integral curves approach the point O, winding
infinitely often around this point as $\varphi \to -\infty$ (figure 13). If k < 0,
then this happens for $\varphi \to +\infty$. In these cases, the point O is called a
focus. If, however, k = 0, then the collection of integral curves of (56)
consists of circles with center at the point O. Generally, if some
neighborhood of the point O is completely filled by closed integral curves
surrounding the point O itself, then such a point is called a center.

[Fig. 12.] [Fig. 13.]

A center may easily be transformed into a focus, if in the numerator
and the denominator of the right side of equation (54) we add a term of
arbitrarily high order; consequently, in this case the behavior of integral
curves near a singular point is not given by terms of the first order.

Equation (55), corresponding to equation (45), is identical with the
characteristic equation (19). Thus figures 10 and 12 schematically
represent the behavior in the phase plane (x, y) of the curves

$$x = x(t), \qquad y = x'(t),$$

corresponding to the solutions of equation (6) for real $\lambda_1$ and $\lambda_2$ of the
same sign; figure 11 corresponds to real $\lambda_1$ and $\lambda_2$ of opposite signs, and
figures 13 and 8 (the case of a center) correspond to complex $\lambda_1$ and $\lambda_2$.
If the real parts of $\lambda_1$ and $\lambda_2$ are negative, then the point (x(t), y(t))
approaches 0 for $t \to +\infty$; in this case the point x = 0, y = 0 corresponds
to stable equilibrium. If, however, the real part of either of the numbers
$\lambda_1$ and $\lambda_2$ is positive, then at the point x = 0, y = 0, there is no stable
equilibrium.
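
The classification of this section can be summarized in a small Python sketch for the linear equation dy/dx = (ax + by)/(cx + dy). It is an illustration under an explicit assumption: the roots $\lambda_1$, $\lambda_2$ are taken to be those of the quadratic $\lambda^2 - (b + c)\lambda + (bc - ad) = 0$, which is what the determinant (49) yields for M = ax + by and N = cx + dy. The function name is hypothetical, and the verdict "center" holds only for the linear equation itself, since terms of higher order can turn a center into a focus.

```python
def classify_singular_point(a, b, c, d):
    """Classify the origin for dy/dx = (a*x + b*y) / (c*x + d*y).

    Assumption: lambda_1, lambda_2 are the roots of
    lambda**2 - (b + c)*lambda + (b*c - a*d) = 0.
    """
    if a * d - b * c == 0:
        raise ValueError("the determinant a*d - b*c must be different from zero")
    s = b + c            # lambda_1 + lambda_2
    p = b * c - a * d    # lambda_1 * lambda_2
    disc = s * s - 4 * p
    if disc < 0:
        # Complex roots alpha +/- beta*i with alpha = s / 2 (exact test on s).
        return "center" if s == 0 else "focus"
    # Real roots: they have opposite signs exactly when their product is negative.
    return "saddle" if p < 0 else "node"


print(classify_singular_point(0, 1, 1, 0))    # dy/dx = y/x
print(classify_singular_point(0, -1, 1, 0))   # dy/dx = -y/x
print(classify_singular_point(-1, 0, 0, 1))   # dy/dx = -x/y
print(classify_singular_point(1, 1, 1, -1))   # dy/dx = (x + y)/(x - y)
```

The four sample equations give, in order, a node (straight lines through the origin), a saddle (hyperbolas), a center (concentric circles), and a focus (spirals).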


§7. Qualitative Theory of Ordinary Differential Equations 

An important part of the general theory of ordinary differential equations
is the qualitative theory of differential equations. It arose at the end
of the last century from the requirements of mechanics and astronomy.

In many practical problems, it is necessary to establish the character
of the solution of a differential equation describing some physical process
and to describe the properties of its solutions as the independent variable
ranges over a finite or infinite interval. For example, in celestial mechanics,
which studies the motion of heavenly bodies, it is important to have
information about the behavior of the solutions of differential equations
describing the motion of the planets or other heavenly bodies for
unbounded periods of time.

As we said earlier, for only a few particularly simple equations can a
general solution be expressed in terms of integrals of known functions.
So there arose the problem of investigating the properties of the solutions
of a differential equation from the equation itself. Since the solution of a
differential equation is given in the form of a curve in a plane or in space,
the problem consisted of investigating the properties of integral curves,
their distribution and their behavior in the neighborhood of singular
points. For example, do they lie in a bounded part of the plane or do they
have branches tending to infinity, are some of them closed curves, and
so forth? The investigation of such questions constitutes the qualitative
theory of differential equations.

The founders of the qualitative theory of differential equations are the
Russian mathematician A. M. Ljapunov and the French mathematician
H. Poincaré.

In the preceding section, we considered in detail one of the important
questions of the qualitative theory, namely the distribution of integral
curves in a neighborhood of a singular point. We turn now to some other
basic questions in qualitative theory.

Stability. In the examples considered at the beginning of the chapter,
the question of stability or instability of the equilibrium of a system was
easily answered from physical considerations, without investigating the
differential equations. Thus in example 3 it is obvious that if the pendulum,
in its equilibrium position OA, is moved by some external force to a nearby
position OA', i.e., if a small change is made in the initial conditions, then
the subsequent motion of the pendulum cannot carry it very far from the
equilibrium position, and this deviation will be smaller for smaller original
deviations OA', i.e., in this case the equilibrium position will be stable.

For other more complicated cases, the question of stability of the
equilibrium position is considerably more complicated and can be dealt
with only by investigating the corresponding differential equations. The
problem of the stability of equilibrium is closely connected with the
question of the stability of motion. Fundamental results in this field were
established by A. M. Ljapunov.

Let some physical process be described by the system of equations

$$\frac{dx}{dt} = f_1(x, y, t), \qquad \frac{dy}{dt} = f_2(x, y, t). \tag{57}$$

For simplicity, we consider only a system of two differential equations,
although our conclusions remain valid for a system with a larger number
of equations. Each particular solution of the system (57), consisting of
two functions x(t) and y(t), will sometimes be called a motion, following
the usage of Ljapunov. We will assume that $f_1(x, y, t)$ and $f_2(x, y, t)$ have
continuous partial derivatives. It has been shown that, in this case, the
solution of the system of differential equations (57) is uniquely defined if
at any instant of time $t = t_0$ the initial values $x(t_0) = x_0$ and $y(t_0) = y_0$
are given.

We will denote by $x(t, x_0, y_0)$ and $y(t, x_0, y_0)$ the solution of the system
of equations (57) satisfying the initial conditions

$$x = x_0 \quad \text{and} \quad y = y_0 \quad \text{for } t = t_0.$$

A solution $x(t, x_0, y_0)$, $y(t, x_0, y_0)$ is called stable in the sense of Ljapunov
if for all $t > t_0$ the functions $x(t, x_0, y_0)$ and $y(t, x_0, y_0)$ have arbitrarily
small changes for sufficiently small changes in the initial values $x_0$ and $y_0$.

More exactly, for a solution to be stable in the sense of Ljapunov, the
differences

$$|x(t, x_0 + \delta_1, y_0 + \delta_2) - x(t, x_0, y_0)|, \qquad |y(t, x_0 + \delta_1, y_0 + \delta_2) - y(t, x_0, y_0)| \tag{58}$$

may be made less than any previously given number $\varepsilon$ for all $t > t_0$, if
the numbers $\delta_1$ and $\delta_2$ are taken sufficiently small in absolute value.

Every motion that is not stable in the sense of Ljapunov is called
unstable.

In his investigation, the motion $x(t, x_0, y_0)$ and $y(t, x_0, y_0)$ was called
by Ljapunov unperturbed, and the motion $x(t, x_0 + \delta_1, y_0 + \delta_2)$,
$y(t, x_0 + \delta_1, y_0 + \delta_2)$ with nearby initial conditions was called perturbed.
In this way stability in the sense of Ljapunov for an unperturbed motion
means that for all $t > t_0$ the perturbed motion must differ only a little
from the unperturbed.

The stability of equilibrium is a special case of stability of motion,
corresponding to the case in which the unperturbed motion is

$$x(t, x_0, y_0) \equiv 0 \quad \text{and} \quad y(t, x_0, y_0) \equiv 0.$$

Conversely, the question of the stability of any motion $x = \varphi_1(t)$ and
$y = \varphi_2(t)$ of the system (57) may be reduced to the question of the stability
of equilibrium for some system of differential equations. To this end we
replace the unknown functions x(t) and y(t) in the system (57) by the
new unknown functions

$$\xi = x - \varphi_1(t) \quad \text{and} \quad \eta = y - \varphi_2(t). \tag{59}$$

In the system (57) transformed in this way, the motion $x = \varphi_1(t)$ and
$y = \varphi_2(t)$ will correspond to the motion $\xi = 0$ and $\eta = 0$, i.e., the position
of equilibrium. In what follows we will everywhere assume that the
transformation (59) has been made, so that we may consider stability
in the sense of Ljapunov only for the solution x = 0, y = 0.

The condition of stability in the sense of Ljapunov now means that,
for $\delta_1$ and $\delta_2$ sufficiently small and $t > t_0$, the trajectory in the (x, y)
plane of a perturbed motion does not pass outside of the square with
sides of length $2\varepsilon$ parallel to the coordinate axes and with center at the
point x = 0, y = 0.

We will be interested in those cases in which, without knowing an
integral of the system (57), we can nevertheless arrive at conclusions about
the stability or instability of a motion. Stability is a very important
practical question in the motion of projectiles, or of aircraft; and the
stability of orbits is important in celestial mechanics, where the motion
of planets and other heavenly bodies leads to this kind of investigation.

We assume that the functions $f_1(x, y, t)$ and $f_2(x, y, t)$ may be represented
in the form

$$f_1(x, y, t) = a_{11}x + a_{12}y + R_1(x, y, t), \qquad f_2(x, y, t) = a_{21}x + a_{22}y + R_2(x, y, t), \tag{60}$$

where the $a_{ij}$ are constants, and $R_1(x, y, t)$ and $R_2(x, y, t)$ are functions of
x, y, and t such that

$$|R_1(x, y, t)| < M(x^2 + y^2) \quad \text{and} \quad |R_2(x, y, t)| < M(x^2 + y^2), \tag{61}$$

where M is a positive constant.

If in the system (57) we substitute equations (60), neglecting $R_1(x, y, t)$
and $R_2(x, y, t)$, we get a system of differential equations with constant
coefficients

$$\frac{dx}{dt} = a_{11}x + a_{12}y, \qquad \frac{dy}{dt} = a_{21}x + a_{22}y, \tag{62}$$

which is called the system of first approximation to the nonlinear system (57).

Before the time of Ljapunov, researchers confined themselves to
investigating stability of the first approximation, believing that the results
obtained would carry over to the question of stability for the basic
nonlinear system (57). Ljapunov was the first to show that in the general case
this conclusion is false. On the other hand, he gave a series of very wide
conditions under which the question of stability for the nonlinear system is
completely solved by the first approximation. One of these conditions is
the following. If the real parts of both the roots of the equation

$$\begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix} = 0$$

are negative and the functions $R_1(x, y, t)$ and $R_2(x, y, t)$ fulfill condition
(61), then the solution $x(t) \equiv 0$, $y(t) \equiv 0$ is stable in the sense of Ljapunov.
If the real part of either of the roots is positive, then the solution
$x(t) \equiv 0$, $y(t) \equiv 0$ of a system satisfying the conditions (61) is unstable.
Ljapunov also gave a series of other sufficient conditions for stability and
instability of a motion.*
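
The criterion just stated is easy to apply mechanically. The following Python sketch, an illustration with a hypothetical function name, computes the two roots of the quadratic $\lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0$ obtained by expanding the determinant and reports what the first approximation allows one to conclude, on the assumption that the remainders satisfy condition (61). When neither hypothesis of the criterion holds (for instance, purely imaginary roots), it returns "undecided", since the statement above says nothing about that case.

```python
import cmath


def first_approximation_verdict(a11, a12, a21, a22):
    """Apply the quoted criterion to the coefficients of the system (62)."""
    s = a11 + a22                 # sum of the roots
    p = a11 * a22 - a12 * a21     # product of the roots
    root = cmath.sqrt(s * s - 4 * p)
    lam1, lam2 = (s + root) / 2, (s - root) / 2
    if lam1.real < 0 and lam2.real < 0:
        return "stable"
    if lam1.real > 0 or lam2.real > 0:
        return "unstable"
    return "undecided"


print(first_approximation_verdict(-1, 0, 0, -2))   # roots -1 and -2
print(first_approximation_verdict(0, 1, 1, 0))     # roots +1 and -1
print(first_approximation_verdict(0, 1, -1, 0))    # roots +i and -i
```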

If the right sides of equations (57) do not depend on t, then dividing
the first equation of the system (57) by the second we get

$$\frac{dx}{dy} = \frac{f_1(x, y)}{f_2(x, y)}. \tag{63}$$

The origin will be a singular point for this equation. In the case of stability
of equilibrium, this point may be a focus, a node, or a center, but cannot
be a saddle point. Thus the character of a singular point may be determined
from the stability or instability of the equilibrium position.

* A. M. Ljapunov, The general problem of stability of motion.

The behavior of integral curves in the large. It is sometimes important
to construct a schematized representation of the behavior of the
integral curves “in the large”; that is, in the entire domain of the given
system of differential equations, without attempting to preserve the scale.
We will consider a space in which this system defines a field of directions
as the phase space of some physical process. Then the general scheme of
the integral curves, corresponding to the system of differential equations,
will give us an idea of the character of all processes (motions) which can
possibly occur in this system. In figures 10-13 we have constructed
approximate schematized representations of the behavior of the integral
curves in the neighborhood of an isolated singular point.

One of the most fundamental problems in the theory of differential
equations is the problem of finding as simple a method as possible for
constructing such a scheme for the behavior of the family of integral
curves of a given system of differential equations in the entire domain
of definition, in order to study the behavior of the integral curves of this
system of differential equations “in the large.” This problem remains
almost untouched for spaces of dimension higher than 2. It is still very
far from being solved for the single equation of the form

$$\frac{dy}{dx} = \frac{M(x, y)}{N(x, y)} \tag{64}$$

even when M(x, y) and N(x, y) are polynomials.

In what follows, we will assume that the functions M(x, y) and N(x, y)
have continuous partial derivatives of the first order.

If all the points of a simply connected domain G, in which the right side 
of the differential equation (64) is defined, are ordinary points, then the 
family of integral curves may be represented schematically as a family 
of segments of parallel straight lines; since in this case one integral curve 
will pass through each point, and no two integral curves can intersect. 
For an equation (64) of more general form, which may have singular 
points, the structure of the integral curves may be much more complicated. 
The case in which equation (64) has an infinite set of singular points (i.e., 
points where the numerator and the denominator both vanish) may be 
excluded, at least when M(x, y) and N(x, y) are polynomials. Thus we 
restrict our consideration to those cases in which equation (64) has a 
finite number of isolated singular points. The behavior of the integral 
curves that are near to one of these singular points forms the essential 
element in setting up a schematized representation of the behavior of all 
the integral curves of the equation. 

A very typical element in such a scheme for the behavior of all the 
integral curves of equation (64) is formed by the so-called limit cycles. 
Let us consider the equation 



$$\frac{d\rho}{d\varphi} = \rho - 1, \qquad (65)$$

where ρ and φ are polar coordinates in the (x, y) plane. 

The collection of all integral curves of equation (65) is given by the 
formula 

$$\rho = 1 + Ce^{\varphi}, \qquad (66)$$

where C is an arbitrary constant, different for different intégral curves. 
In order that ρ be nonnegative, it is necessary that φ have values no larger 
than −ln |C|, C < 0. The family of integral curves will consist of 

1. the circle ρ = 1 (C = 0); 

2. the spirals issuing from the origin, which approach this circle from 
the inside as φ → −∞ (C < 0); 

3. the spirals which approach the circle ρ = 1 from the outside as 
φ → −∞ (C > 0) (figure 14). 
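
The three cases in the list can be checked directly from formula (66). The following short derivation is only a sketch using (66) itself: it shows which first-order equation the curves satisfy, why every curve tends to the circle ρ = 1 as φ → −∞, and where the bound on φ for C < 0 comes from.

```latex
\begin{align*}
\rho = 1 + Ce^{\varphi}
  &\;\Longrightarrow\; \frac{d\rho}{d\varphi} = Ce^{\varphi} = \rho - 1, \\
\rho - 1 = Ce^{\varphi}
  &\;\Longrightarrow\; \rho \to 1 \ \text{as}\ \varphi \to -\infty,
   \quad \operatorname{sign}(\rho - 1) = \operatorname{sign} C, \\
C < 0:\quad 1 + Ce^{\varphi} \ge 0
  &\;\Longleftrightarrow\; e^{\varphi} \le \frac{1}{|C|}
   \;\Longleftrightarrow\; \varphi \le -\ln |C| .
\end{align*}
```

The sign relation in the second line is what separates the spirals inside the circle (C < 0) from those outside it (C > 0).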

The circle ρ = 1 is called a limit cycle for equation (65). In general a 
closed integral curve l is called a limit cycle, if it can be enclosed in a 
disc all points of which are ordinary for equation (64) and which is 
entirely filled by nonclosed integral curves. 

From equation (65) it can be seen that all points of the circle are ordinary. 
This means that a small piece of a limit cycle is not different from a small 
piece of any other integral curve. 

Every closed integral curve in the (x, y) plane gives a periodic solution 
[x(t), y(t)] of the system 

$$\frac{dx}{dt} = N(x, y), \qquad \frac{dy}{dt} = M(x, y), \qquad (67)$$

describing the law of change of some physical system. Those integral 
curves in the phase plane that as t → +∞ approximate a limit cycle are 
motions that as t → +∞ approximate periodic motions. 

Let us suppose that for every point (x_0, y_0) sufficiently close to a limit 
cycle l, we have the following situation: If (x_0, y_0) is taken as initial point 
(i.e., for t = t_0) for the solution of the system (67), then the corresponding 
integral curve traced out by the point [x(t), y(t)], as t → +∞, approximates 
the limit cycle l in the (x, y) plane. (This means that the motion in question 
is approximately periodic.) In this case the corresponding limit cycle is 
called stable. Oscillations that act in this way with respect to a limit cycle 
correspond physically to self-oscillations. In some self-oscillatory systems, 
there may exist several stable oscillatory processes with different 
amplitudes, one or another of which will be established by the initial 
conditions. In the phase plane for such “self-oscillatory systems,” there 
will exist corresponding limit cycles if the processes occurring in these 
systems are described by an equation of the form (67). 
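
The notion of a stable limit cycle can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the right-hand sides in `field` are a hypothetical example chosen for this purpose (they do not come from the text), and the names `field`, `rk4_step`, and `final_radius` are likewise illustrative. In polar coordinates the assumed system reads dρ/dt = ρ(1 − ρ²), dφ/dt = 1, so the circle ρ = 1 is a closed integral curve, and motions started inside or outside it should approach it as t grows.

```python
import math


def field(x, y):
    """Hypothetical right-hand sides of dx/dt = N(x, y), dy/dt = M(x, y).

    In polar coordinates: d(rho)/dt = rho * (1 - rho**2), d(phi)/dt = 1.
    """
    r2 = x * x + y * y
    return x - y - x * r2, x + y - y * r2


def rk4_step(x, y, h):
    """Advance the point (x, y) by one classical Runge-Kutta step of size h."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    x_new = x + h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def final_radius(x0, y0, t_end=20.0, h=0.01):
    """Integrate from the initial point (x0, y0) up to time t_end.

    Returns the distance of the final point from the origin.
    """
    x, y = x0, y0
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h)
    return math.hypot(x, y)


if __name__ == "__main__":
    # Two initial points inside the circle and two outside it.
    for start in [(0.1, 0.0), (0.5, 0.5), (2.0, 0.0), (0.0, -3.0)]:
        print(start, "-> final radius", final_radius(*start))
```

For this assumed system every initial point other than the origin gives a final radius close to 1, which is the behavior that the definition of a stable limit cycle describes.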

The problem of finding, even if only approximately, the limit cycles of 
a given differential equation has not yet been satisfactorily solved. The 
most widely used method for solving this problem is the one suggested by 
Poincaré of constructing “cycles without contact.” It is based on the 
following theorem. We assume that on the (x, y) plane we can find two 
closed curves L_1 and L_2 (cycles) which have the following properties: 

1. The curve L_2 lies in the region enclosed by L_1. 

2. In the annulus Ω, between L_1 and L_2, there are no singular points of 
equation (64). 

3. L_1 and L_2 have tangents everywhere, and the directions of these 
tangents are nowhere identical with the direction of the field of directions 
for the given equation (64). 

4. For all points of L_1 and L_2, the cosine of the angle between the 
interior normals to the boundary of the domain Ω and the vector with 
components [N(x, y), M(x, y)] never changes sign. 

Then between L_1 and L_2, there is at least one limit cycle of equation 
(64). 

Poincaré called the curves L_1 and L_2 cycles without contact. 
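
The last two conditions of the theorem can be tested numerically on sampled points. The sketch below rests on assumptions that are not in the text: the vector [N(x, y), M(x, y)] in `field` is a hypothetical example whose only singular point is the origin, the two candidate curves are circles about the origin, and the function names are illustrative. Checking finitely many sample points is of course only a plausibility test, not a proof.

```python
import math


def field(x, y):
    """Hypothetical vector [N(x, y), M(x, y)]; it vanishes only at the origin."""
    r2 = x * x + y * y
    return x - y - x * r2, x + y - y * r2


def inward_cosines(radius, normal_sign, samples=360):
    """Cosines of the angle between the field and the normal into the annulus.

    The curve is the circle of the given radius about the origin.
    normal_sign = -1.0 for the outer circle (the annulus lies inside it),
    normal_sign = +1.0 for the inner circle (the annulus lies outside it).
    """
    cosines = []
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        nx, ny = normal_sign * math.cos(phi), normal_sign * math.sin(phi)
        fx, fy = field(x, y)
        cosines.append((fx * nx + fy * ny) / math.hypot(fx, fy))
    return cosines


def is_cycle_without_contact_pair(r_outer, r_inner, samples=360):
    """Sampled check that the cosine is nonzero and keeps one sign on both circles.

    A zero cosine would mean the field is tangent to the curve at that point;
    a change of sign would mean curves both enter and leave the annulus.
    """
    values = inward_cosines(r_outer, -1.0, samples)
    values += inward_cosines(r_inner, +1.0, samples)
    return all(v > 0 for v in values) or all(v < 0 for v in values)


if __name__ == "__main__":
    print(is_cycle_without_contact_pair(2.0, 0.5))
    print(is_cycle_without_contact_pair(2.0, 1.0))
```

For the assumed field, the circles of radius 2 and 0.5 pass the check, so by the theorem the annulus between them contains a limit cycle. Taking the inner circle of radius 1 fails the check, because that circle is itself an integral curve of the assumed system and the field is tangent to it.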

The proof of this theorem is based on the following rather obvious fact. 
We assume that for decreasing t (or for increasing t) all the integral curves 

x = x(t), y = y(t) 

of equation (64) (or, what amounts to the same thing, of equations (67), 
where t is a parameter), which intersect L_1 or L_2, enter the annulus Ω 
between L_1 and L_2. Then they must necessarily tend to some closed curve 
l lying between L_1 and L_2; since none of the integral curves lying in the 
annulus can leave it, and there are no singular points there. 

But the problem of finding cycles without contact is also a complicated 
one and no general methods are known for solving it. For particular 
examples it has been possible to find cycles without contact, thereby 
proving the existence of limit cycles. 

In radio technology it is important to find limit cycles (self-oscillatory 
processes) for equation (16) for the electron-tube generator. For equations 
of the type of (16), N. M. Krylov and N. N. Bogoljubov gave a method, 
about twenty years ago, for approximate computation of a certain limit 
cycle that exists for this equation. At about the same time the Soviet 
physicists L. I. Mandel’štam, N. D. Papaleksi, and A. A. Andronov gave 
a proof of the possibility of applying what is called the method of the 
small parameter, a method that to some extent had been used earlier in 
practice, though without any rigorous justification. Andronov was also 
the first to make systematic practical use, in the analysis of self-oscillatory 
systems, of the theoretical methods already developed by Ljapunov and 
Poincaré. In this manner he obtained a whole series of important results. 

As was mentioned earlier, an important role is played in physics by 
“insensitive” systems (cf. §3). Andronov, together with L. S. Pontrjagin, set 
up a catalogue of the elements from which one could construct a complete 
chart of the behavior of the integral curves in the (x, y) plane for an 
insensitive differential equation of the form (64). It had been long known, 
for example, that a center near a singular point is easily destroyed by 
small changes in the equations (64). Thus in the construction of a chart of 
the behavior of the integral curves of equation (64), we cannot have a 
center, i.e., a family of closed integral curves surrounding a singular point, 
if the equation is “insensitive.” 
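
How easily a center is destroyed can be seen on a simple example. The linear system below and the small parameter ε are illustrative assumptions, not taken from the text; for ε = 0 the system has a center at the origin.

```latex
\[
\frac{dx}{dt} = -y + \varepsilon x, \qquad \frac{dy}{dt} = x + \varepsilon y .
\]
With $x = \rho\cos\varphi$, $y = \rho\sin\varphi$ one finds
$\rho\,\dfrac{d\rho}{dt} = x\,\dfrac{dx}{dt} + y\,\dfrac{dy}{dt} = \varepsilon\rho^{2}$
and
$\rho^{2}\,\dfrac{d\varphi}{dt} = x\,\dfrac{dy}{dt} - y\,\dfrac{dx}{dt} = \rho^{2}$,
so that
\[
\frac{d\rho}{dt} = \varepsilon\rho, \qquad \frac{d\varphi}{dt} = 1,
\qquad\text{hence}\qquad \rho = \rho_0\, e^{\varepsilon\varphi}.
\]
```

For ε = 0 the integral curves are the circles ρ = ρ_0, but for every ε ≠ 0, however small, they become spirals and no closed integral curve surrounds the origin.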

The question of the behavior of the integral curves in the large is still 
far from its final solution. We note that the analogous and probably 
simpler question of the form of real algebraic curves in the plane, i.e., 
curves defined by the equation 

P(x, y) = 0, 

where P(x, y) is a polynomial of degree n, is also far from a complete 
solution. The form of these curves is completely known only for n < 6. 

The solutions of the system (64) define motions in the plane. If we 
replace each point (x_0, y_0) in the plane by the corresponding point 
[x(t, x_0, y_0), y(t, x_0, y_0)], where x(t, x_0, y_0) and y(t, x_0, y_0) are the 
solution of the system (64) with initial conditions x = x_0 and y = y_0 
for t = t_0, we obtain a transformation of the points of the plane depending 
on the parameter t. Similar transformations depending on a parameter, 
together with the motions they generate, may be considered on a sphere, 
a torus, or other manifolds. The properties of these motions are studied 
in the theory of dynamical systems. In a neighborhood of every point 
these motions are the solutions of some system of differential equations. 
In the past decade the theory of dynamical systems has been developed 
on a broad basis in the works of V. V. Stepanov, A. Ja. Hinčin, N. N. 
Bogoljubov, N. M. Krylov, A. A. Markov, V. V. Nemyckiĭ and others, 
and also in the works of G. D. Birkhoff and other mathematicians. 

In this chapter we have given a brief outline of the present state of the 
theory of ordinary differential equations and have attempted to describe 
the problems that are considered in this theory. Our study in no sense 
pretends to be complete. We have had to omit consideration of many 
branches of the theory that arise in the study of more special problems 
or that require broader mathematical knowledge than the reader of this 
book is assumed to possess. For example, we have nowhere touched 
upon the general and important area in which the theory of differential 
equations with complex argument is considered. We have had no 
opportunity to examine the theory of boundary-value problems and, in 
particular, of eigenfunctions, which is of great importance in the 
applications. 

We have also been able to pay very little attention to approximative 
methods for the numerical or analytical solution of differential equations. 
For these questions, we recommend that the reader consult the specialized 
literature. 